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CHAPTER  1 


INTRODUCTION 

Using  a  mathematical  model  to  represent  some  real-world  phenomena  requires 
patience,  wisdom,  and  perhaps  most  of  all  knowledge.  All  such  models  have  parameters 
associated  with  them.  These  parameters  can  be  placed  into  two  categories:  input 
parameters  and  output  parameters  or  performance  measures.  An  input  parameter  is  any 
number  important  to  the  model  that  the  experimenter  must  provide.  An  example  of  an 
input  parameter  in  a  reliability  model  is  the  probability  a  system  component  functions;  in 
a  queueing  model,  an  example  of  an  input  parameter  is  the  rate  at  which  customers  arrive 
at  the  system.  Performance  measures  are  functions  of  the  input  parameters.  The 
experimenter's  goal  is  to  know  the  value  of  a  performance  measure  in  the  model. 
Knowledge  about  these  parameters  is  always  necessary  in  some  capacity  in  order  to  use 
the  model.  In  fact,  this  knowledge  can  be  critical  to  the  implementation  of  the  model  as 
the  following  example  illustrates. 

Suppose  customers  arrive  at  a  bank  that  has  only  one  teller.  If  the  teller  is  idle,  the 
arriving  customer  will  be  served,  otherwise  the  customer  will  join  a  queue.  Further 
assume  the  times  between  customer  arrivals  are  independent  exponential  random 
variables  with  mean  j  and  the  service  times  are  also  independent  exponential  random 
variables  with  mean  K  Consider  using  a  stationary  M/M/1  queue  to  model  this  system. 

In  a  stationary  M/M/1  queue  there  are  two  input  parameters,  the  arrival  rate,  A  and  the 
service  rate,  n.  It  is  imperative  to  know  the  value  of  p  —  ^ ,  the  ratio  of  the  arrival  rate  to 
the  service  rate.  If  this  ratio  is  not  less  than  one,  the  system  is  not  stationary  and  a 
stationary  model  is  therefore  not  appropriate.  If  the  system  is  stationary  and  the 
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performance  measure  of  interest  is  the  mean  waiting  time  in  queue,  W ,  then  one  can 
compute  this  output  parameter,  given  the  arrival  rate  and  service  rate: 


A 

tin  -  A)  ’ 


As  in  the  above  example,  if  the  model  is  stochastic  and  the  input  parameters  are 
parameters  of  probability  distributions  in  the  model,  then  they  are  unknown  and  must  be 
estimated  from  data.  Even  when  using  a  deterministic  model  such  as  linear  programming 
to  solve  an  optimization  problem,  the  coefficients  involved  could  be  unknown  and  would 
have  to  be  estimated.  The  experimenter  must  decide  how  to  best  spend  limited  resources 
in  trying  to  estimate  the  performance  measure.  For  the  M/M/1  queue,  for  example, 
should  he  allocate  the  resources  equally  in  sampling  from  the  arrival  time  distribution  and 
the  service  time  distribution,  or  should  he  concentrate  his  sampling  efforts  more  heavily 
in  one  distribution  or  the  other?  This  very  important  question  is  not  limited  to  this 
specific  example,  but  rather  is  pertinent  anytime  one  wishes  to  estimate  some 
performance  measure  of  any  model  involving  unknown  parameters  from  more  than  one 
distribution. 

Consider  the  problem  of  estimating  a  performance  measure  of  a  mathematical  model 
with  p  input  parameters,  v\,  u2, . . . ,  up.  Let  the  performance  measure  be  denoted  by 
/( vi,V2, ...  ,vp).  If  the  performance  measure  is  an  unknown  function  of  the  input 
parameters,  simulation  must  be  used  to  determine  the  form  of  f(ui  ,u2, ... ,  up).  Even  in 
the  special  case  where  the  performance  measure  is  a  known  function,  the  unknown 
parameters  i/j,  v2, . . . ,  up  in  the  model  must  be  estimated  in  order  to  obtain  an  estimate 
/  =  g(yi,  v2,...,Vp)  of  the  performance  measure.  This  requires  using  a  limited 
sampling  budget  to  obtain  the  estimates  vu  u2,...,uP.  The  experimenter's  goal  is  that 

the  estimator  is  "close"  to  the  true  performance  measure.  That  is,  he  wants 

- — ^ 
fiyu  • . . ,  Vp)  -  f  to  be  as  close  to  zero  as  possible.  Let  L(f  ,  /  )  be  a  loss  function 

that  measures  how  close  the  estimator,  /  ,is  to  the  true  performance  measure, 
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f{v\,V2 ,  •  •  • ,  z'p).  A  reasonable  and  mathematically  tractable  loss  function  is  the  mean 
squared  error  (MSE)  of  )  —  E^f  —  f'j  .  Given  some  budgetary  constraint, 

denoted  b,  the  goal  is  to  find  the  sample  allocation  that  minimizes  the  MSE  of  the 
estimator.  Let  V°(b)  be  the  MSE. 

Faced  with  a  mathematical  model  involving  p  parameters  from  q  different 
distributions,  2  <  q  <  p,  the  problem  can  be  stated  as  follows:  Let  X\j 
i  =  1, 2, . . .  ,q,j  =  1, 2, . ..  ,71*  be  i.i.d.  observations  from  population  i,  ci,  c2, . . . ,  cq  be 
unit  costs  for  sampling  from  populations  1 , 2, . . . ,  q  respectively,  m ,  n2, . . . ,  nq  be  the 
sample  sizes  for  populations  1 , 2, . . . ,  q  respectively,  and  b  >  0  be  a  pre  specified  total 
sampling  budget. 

Find  nopt , ... )  nopt  to: 

minE(J  -  f(uu  u2, . .. ,  up)^j 

s.t.  YjCiUi  <  b 
i— 1 

Assuming  this  optimization  problem  can  be  solved,  the  optimum  set  of  sample  sizes 
(n°pt,  nopt , ... ,  n°pt)  will  be  a  function  of  the  unknown  parameters  i/i,  u2, . . . ,  vp,  and 
thus  can  not  be  computed  without  knowing  v\,  i/2, . .. ,  up.  There  are  at  least  two 
approaches  to  solving  this  problem.  One  method  uses  a  Bayesian  approach  in  which  the 
experimenter  assumes  a  prior  distributional  form  for  each  of  the  q  distributions.  A  second 
approach  involves  a  multiple  stage  sampling  procedure.  For  example,  a  two  stage 
sampling  plan  allocates  a  small  portion  of  the  sampling  budget  in  stage  1  to  obtain  initial 
estimates  of  the  unknown  parameters.  These  estimates  can  then  be  used  to  compute  final 
estimates  of  ( n°xpt ,  n°2pt , . . . ,  n°pt) ,  denoted  (nf  ,nf,...,n*).  In  stage  2,  these 
estimated  "optimal"  sample  sizes  are  used  to  collect  the  final  samples.  Hopefully,  such  a 
multiple  stage  sampling  scheme  would  have  the  following  optimality  property:  As  the 
budget  grows  arbitrarily  large,  the  MSE  of  the  estimator  produced  by  the  multiple  stage 
sampling  scheme,  denoted  I^*(£>),  approaches  the  minimum  MSE,  V°(b),  using  the 
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optimal  sample  allocation.  This  multiple  stage  approach  to  determine  a  sampling 
allocation  is  useful  even  in  the  case  where  the  experimenter  does  not  know  the  value  of 
his  budget,  b,  exactly.  Letting  t°pt  —  n°pt/b,  the  experimenter  can  use 
(topt ,  top\  . . . ,  t°pt)  as  a  guide  to  allocate  the  undetermined  budget. 


CHAPTER  2 


LITERATURE  REVIEW 


The  problem  of  allocating  a  fixed  budget  among  populations  when  estimating  a 
function  of  parameters  was  originally  addressed  by  Ghurye  and  Robbins  (1959).  Ghurye 
and  Robbins  considered  the  problem  of  allocating  a  fixed  budget  among  two  Normal 
populations  when  the  function  of  interest  is  the  difference  in  the  two  population  means. 
That  is,  they  solved  the  following  problem:  Let  f(ni,  fi2)  =  ^1  -  £*2-  Find  (n°pt,  n°2pt) 
such  that  the  MSE  of  ^  XlnoPt  —  X2no&'\  is  minimized.  Using  the  difference  in  the 


sample  means  as  an  estimator  of  their  performance  measure,  they  showed  that  the 
variance  of  ^  XlnoPt  —  X2roPt )  is  minimized  by  an  allocation  proportional  to  the 


population  standard  deviations.  Since  the  population  standard  deviations  are  often 
unknown,  they  proposed  a  two-stage  sampling  plan  in  which  they  used  the  first  stage  to 
get  an  initial  estimate  of  the  population  standard  deviations.  In  the  second  stage,  they 
allocated  the  remaining  budget  in  an  optimal  fashion  using  estimates  from  the  first  stage. 
Their  two-stage  procedure  follows: 

Stage  One: 

(1)  Start  with  initial  random  samples,  Xn,  X12, . . . ,  Xlmi,  X2l,  X22, . . . ,  X2mv  of 
sizes  mi  and  m2,  where  cimi  4-  c2m2  <  b 

(2)  Compute  sample  estimates: 


tn-mx  i 

3=1 _ _ 

mi- 1 


,  i  =  1,2 


Stage  Two: 

(1)  Compute  nf  —  ^  cr^mi^ 

£<fcr,(m,) 

i=l 
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/ 

mi 

•  r  bnt  ^ 
lf  <  mi 

II 

6-c2m2 

if  b-c2m2 

<h- 

nf 

JL 

Cl 

Cl 

ci 

b  n * 

if  mi  < 

bnf 

^  b~c2m2 

k  ci 

Cl 

~  ci 

N*  = 

l  C2 

Ni  =  [N*] ,  N2  =  [TV*]  where  [x]  is  the  largest  integer  less  than  or  equal  to  x. 

(2)  Sample  (Nx  -  mx)  more  observations  from  population  1, 

Sample  (N2  —  m2)  more  observations  from  population  2 

(3)  Estimate  f(p,x,  p2)  =  p,x  -  p2  by  X1Nl  -  X2Nr 

Ghurye  and  Robbins  were  able  to  show  that  their  two  stage  sampling  plan  is  optimal  in 
the  sense  that  the  MSE  of  the  two  stage  estimator  approached  the  minimum  MSE  as  the 
budget  approached  infinity.  More  specifically,  they  proved: 

Theorem  2.1  (Ghurye  and  Robbins):  Let  Xg,  i  =  1, 2,  j  =  1, 2, . . . ,  Ni  be  i.i.d. 
observations  from  Normal (jui}  of)  and  fa  =  Xt.  Let  cx,  c2,  and  p  —  ^  remain  fixed 
while  mi,  m2  and  b  become  infinite  in  such  a  way  that 

jo<h<^<h  <  oo,  where  h,  h!  are  fixed 

tf-O  2  <  =  1,2 

Then  V*(b)/V°(b)  ->  1. 

Ghurye  and  Robbins  then  compared  their  two-stage  sampling  scheme  to  the  somewhat 
naive  approach  of  allocating  the  budget  equally  among  the  Normal  populations.  Their 
results  clearly  showed  the  superiority  of  the  two-stage  plan  for  values  of  p  >  1.5. 


7 


Ghurye  and  Robbins  also  showed  the  strict  assumption  of  Normality  is  not  necessary 
to  their  proof.  Letting  Ffo]  m)  =  Pr-fcf^m)  <  a}  and  Ai(e)  =  [oi  —  e,  Oi  -p  e],  one 
needs  only  assume: 

(I)  There  exists  an  a  >  1  such  that  for  every  fixed  e  >  0, 

ma  Pr{(7;(m)  ^  Afe)}  is  bounded  for  all  m  >  0,  i  =  1, 2; 

(II)  There  exists  an  e  >  0  such  that  e  <  min(<7i,  cr2)  and 

to akdFj(a;  to)  -  of  is  bounded  for  all  to  >  0  and  k  =  2,  —  2; 

(HI)  Either  X\(m)  and  cr,(m)  from  the  sample  are  a  pair  of  mutually  independent 

random  variables,  or  each  population  has  a  finite  fourth  moment. 

Finally,  they  showed  that  both  Poisson  and  Binomial  populations  meet  (I),  (II),  and  (H[). 

Page  (1990)  attacked  the  problem  of  allocating  a  fixed  budget  among  several 

populations  when  estimating  the  product  of  the  means  of  the  populations  using  the 

v 

product  of  sample  means.  That  is,  f(u j,  i/2,  •  •  • ,  vp)  =  II G  is  the  performance  measure 

i=l 

^  ^  ^  p 

of  interest  and  g(ui,  z^, . . . ,  up)  =  Yfti  where  is  the  sample  mean  from  population  i. 

i= 1 

She  showed  that  an  allocation  scheme  based  on  the  coefficients  of  variation  for  the 
populations  minimizes  a  first  order  approximation  to  the  variance  of  the  product  of  the 
sample  means  estimator.  Page  used  proportions  of  the  budget  rather  than  sample  sizes  to 
specify  an  allocation.-She  first  showed  that  an  allocation  exists  that  minimizes  the 
variance  of  the  estimator.  More  specifically  she  proved  the  following: 

Theorem  2.2  (Page):  Fix  b  and  let  t  —  (ti,  t2, ,  tp)  where  U  =  ^  and  Vi  =  — 
i  =  1, 2, ...,  p.  Then  Var^Jl^^j  *s  minimized  by  t  =  (fj,  t2,...,  tp)  with  0  <ti  <  1, 


satisfying  the  following  p  equations: 


Civfc'b-Hl 


wl1?2  j  =  -|9 

1  )  j 

and  JfU  =  1- 

i~  1 


<p- 1, 
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1 1/2 

She  also  proved  that  the  coefficient  of  variation  allocation,  t;  —  — - 

Zcfvj 


1,2, 


minimizes  a  first  order  approximation  to  Var  • 

Page  also  showed  how  to  improve  upon  a  balanced  allocation  scheme  where  each 
population  is  sampled  equally.  She  compared  different  allocations  by  a  measure  called 
asymptotic  relative  efficiency  (ARE).  ARE(f,  t)  is  interpreted  as  the  limiting  ratio  of 

■>  P 

budgets  needed  to  achieve  equal  variance  of  f] rji  with  allocations  t  and  t,  respectively. 

i~  1 

Though  the  coefficient  of  variation  allocation  is  optimal,  it  is  not  useful  in  practice 
because  the  Vi  are  typically  unknown.  To  deal  with  this  problem,  Page  considered  how 
partial  or  incomplete  information,  such  as  upper  and  lower  bounds  on  the  parameters,  can 
be  used  to  determine  an  allocation.  She  first  proved  that  movement  from  balanced 
allocation  in  the  direction  of  coefficient  of  variation  allocation  is  an  improvement.  If  cv- 
allocation  is  unknown,  she  provided  a  method  of  grouping  the  means  based  on 
incomplete  information  which  yields  upper  and  lower  bounds  on  the  cv-allocation.  In 
another  paper  (Page  1985),  she  focused  on  the  desirable  Bayesian  properties  of  such 
allocation  schemes. 

Zheng,  Seila  and  Sriram  (1997a)  looked  at  the  problem  of  estimating  the  product  of  p 
means  with  the  product  of  sample  means  using  a  frequentist  approach.  They  developed  a 
two  stage  sampling  procedure  similar  to  that  of  Ghurye  and  Robbins  and  were  able  to 
prove  the  asymptotic  optimality  of  their  sampling  scheme  under  some  mild  assumptions 
about  the  population  distributions.  Their  two  stage  sampling  plan  for  p  =  3  populations 
follows: 

Stage  One: 


(1)  Start  with  initial  random  samples,  Xu,  X12, . . .  ,Xlmo,  X2i,  X22, . . . ,  X2m0, 
and  Xzi,  X&,  ■■■ ,  Xzmo,  for  a  suitable  mo  defined  below. 
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(2)  Compute  sample  estimates: 

A1 1  -^lmo>  1^2  ~  -^2mo!  /^3  -^3mo> 

m0  _  0 

£{*»-  ■*■«,)’ 

=  “-15=! -  (*  =  1,2,3). 


Stage  Two: 

(1)  Let  n*  = 


V-iWi+WzOi+'ii-JhOi  ’ 


n 


★  _  _ 

2  »2Wi+V'iW2+ZiW3 
★  _  _ %£2£l _ 


,  and 


Ti  -  — -  ^  ^ 

3  ’ 


(2)  TV*  =  <( 


wo 

if 

n* 

< 

m0 

m0 

+ 

n*(b 

nf 

-3m0) 

if 

n* 

> 

mo, 

nf 

> 

w0, 

n* 

< 

m0 

m0 

+ 

n*(b 

«,*■ 

-3mo) 

if 

n* 

> 

m0, 

n{ 

< 

^0, 

> 

m0 

b  — 

2mo 

if 

n* 

> 

mo, 

n* 

< 

W0, 

< 

m0 

n* 

if 

n* 

> 

mo, 

n* 

> 

Wo, 

nt 

> 

m0 

iV2* 


W0 

if 

n* 

< 

m0 

Wo 

+ 

n*  (6~3mo) 
nf+n^ 

if 

n* 

> 

mo. 

nf 

> 

Wo, 

n* 

<  m0 

W0 

+ 

71*  (6  —  3771o) 
n*+n* 

if 

n* 

< 

mo, 

nf 

> 

Wo, 

nf 

>  mo 

2mo 

if 

n* 

< 

m0, 

n* 

> 

Wo, 

n* 

<  mQ 

n* 

if 

n* 

> 

m0, 

> 

Wo, 

n* 

>  m0 

N*  =  b-  N*  -  Nf 


(3)  Compute  the  final  sample  sizes:  jVx  =  [AT*],  iV2  =  [TV*],  TV3  =  [TV*], 
where  [a;]  is  the  largest  integer  less  than  or  equal  to  x. 


and 


10 

(4)  Sample  (Ni  —  mo)  more  observations  from  population  1,  ( N2  —  m0)  more 
observations  from  population  2,  and  (7V3  -  m0)  more  observations  from  population  3. 

(5)  Estimate  pip2p3  by  XlNi  X2Ni  X3Ny 

Specifically,  in  the  case  of  p  =  3  where  EXt  =  m  and  Var(Xi)  =  of,  they  proved  the 
following: 


Theorem  2.3  (Zheng,  Seila,  and  Sriram):  Assume  that  ph  p2  and  p3  are  all  positive. 
Define  Dfe)  =  ^  —  e,  ^  +  e  ,  Tft)  =  i  =  1, 2, 3.  Assume  populations  1,  2,  and 
3  satisfy: 

(1)  EXf  <  oo,  EX\  <  oo,  and  EXj  <  oo; 

(2)  There  exists  a  (3  >  1  such  that  for  every  e  >  0 

tPP[Ti{t)tDi(e)]  =  0{l),ast  — >  oo; 

(3)  There  exists  an  e  >  0  and  e  <  mini= ij2,3  such  that  for  each  i  =  1,2,3 

“  (S)  }  =  ast~^oo,fork=  -  4,4. 


Then  let  the  initial  sample  size  n0  =  ba  where  ^  <  a  <  -  and  0  satisfies  (2 ). 
For  sample  sizes  N\,N2,  and  N3  defined  by  their  two-stage  sampling  scheme  and 


ho  =  min{  2  —  a,  a(l  +  >0/2),  1  +  a}  we  /zave: 

E{  X1Nl  X2Ni  X3Ni  -  pip2pz)  =  V°(b)  +  O(^)  as  b 
Consequently, 

e(xINi  X2n2 

— - 70^ - - - ►  1  as  b  — >  oo. 
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In  addition  to  proving  the  asymptotic  optimality  of  their  sampling  scheme,  they 

performed  simulations  assuming  Normal  distributions  for  the  p  =  3  case.  Their 

experiments  demonstrated  the  superiority  of  the  two-stage  sampling  scheme  over  a  simple 

one-stage  approach  which  evenly  allocates  the  sampling  budget  among  the  three 

populations.  They  also  demonstrated  the  optimality  of  the  two-stage  scheme  by  showing 

that  the  variance  of  the  estimator  in  their  sampling  plan  approaches  V°(b)  as  b  gets  larger. 

Additionally,  they  tested  the  sensitivity  of  a,  and  thus  initial  sample  size,  by  running 

simulations  with  different  values  of  a.  The  results  showed  that  different  initial  sample 

sizes  do  affect  the  results,  but  this  effect  is  diminished  as  b  grows  larger.  In  a  previous 

paper,  Zheng,  Seila  and  Sriram  (1996)  showed  that  Normal,  Poisson,  Exponential  and 

Bernoulli  populations  meet  their  mild  assumptions. 

Zheng,  Seila  and  Sriram  (1997b)  broadened  the  range  of  models  by  studying  the 

problem  of  allocating  a  fixed  budget  among  the  arrival  and  service  time  distributions  of 

an  M/M/1  queue.  Zheng  and  Seila  (1996)  showed  that  the  substitution  estimator  for 

^  x2 

mean  waiting  time  in  queue,  WXq  =  — 2-2=r~,  which  is  obtained  by  substituting  J— 

aMj-  ^  °  xlni 

for  A  and  -==—  for  p  into  the  expression  for  mean  waiting  time,  W  (A,  u)  =  -  ,  has 

'''*■2712  ^  AHP  'v 

infinite  mean  squared  error.  This  result  was  already  known;  see  Schruben  and  Kulkami 
(1982). 


Theorem  2.4  (Zheng  and  Seila):  By  assuming  p  =  £  <  p0  <  1,  where  p0  is  known,  the 


alternative  estimator 


x 2 

A2  n2 


X 


In, 


-x* 


“1  ^2 
Po  ~Y 
1-P0  ^2712 


if  n2  <  Po  Xln^ 
otherwise 


has  the  following  properties : 


Wn 


a.s. 


Wr 


q  as  n1}n2  oo, 


EW,  =  W,  +  0{±)  +  0(±),  and 
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E(W,~W,)  -^i  +  ^i  +  0($)  +  0($) +  £>(£). 

Using  this  alternative  estimator,  Zheng,  Seila,  and  Sriram  developed  the  following  two 
stage  sampling  scheme: 

Stage  One: 

(1)  Start  with  initial  random  samples,  Xu ,  Xn, . . . ,  Xlno  of  interarrival  times, 
and  X2i ,  X22 ,  •  •  • ,  AT2noof  service  times,  for  a  suitable  n0  defined  below. 

(2)  Compute  sample  estimates: 


A  = 


_  1 


x 


ln0 


and  ju  = 


V2n0 


Stage  Two: 

(1)  Let  nf  = 


b  Jl 


nf  = 


62  (ji -X) 

cifi+cfc%22(jl-- 1 


(2)  N* 


if  n*  <  fty 
if  n*  >  b-^ 

1  —  Ci 

if  no  <  nf  <  b~^-n° 


jy*  __  b-ciNy 
2  c2 


(3)  Compute  the  final  sample  sizes:  Ni  -  [TV,*],  N2  =  [N*]  where  [x]  is  the 
largest  integer  less  than  or  equal  to  x. 


(4)  Take  (Ni  -  n0)  more  interarrival  time  observations  and  (N2  -  n0)  more 
service  time  observations. 


Specifically,  they  proved  the  following  optimality  property  for  this  estimator: 
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Theorem  2.5  (Zheng,  Seila,  and  Sriram):  Let  a.  e  ^0.5, 1  —  and 

ho  =  min(  2  -  a,  1  +  a).  Then  for  their  two-stage  procedure  and  initial  sample  size 
n0  =  ba  : 

(1)  E{Wq-Wq)2  =  V°(b)+0{±), 

(2)  The  optimal  value  of  a  is  between  0.5  and  1  —  ,  and  ho  >  1.5,  for  any 

q€(0.5,1-S^). 

Zheng,  Seila,  and  Sriram  ran  simulations  to  support  their  theoretical  results.  Their 
empirical  work  clearly  showed  the  MSE  of  their  two  stage  estimator  tends  to  the 
minimum  variance  as  the  sample  size  gets  larger  and  it  only  takes  relatively  small  sample 
sizes  to  get  reasonably  close.  Additionally,  their  simulations  showed  that  the  rate  of 
convergence  slows  as  p,  the  traffic  intensity,  increases. 

The  results  were  extended  to  three  other  system  performance  measures  of  the  M/M/1 
queue:  mean  waiting  time  in  system,  mean  number  of  customers  in  the  system,  and  mean 
number  of  customers  in  the  queue.  The  results  were  also  broadened  to  include  the 
M/Efc/1  queue. 


CHAPTER  3 


A  RELIABILITY  MODEL 

We  will  consider  the  problem  of  allocating  a  sample  to  estimate  the  following 
function  of  three  population  means  by  the  corresponding  function  of  sample  means: 

=  Ml(M2  +  M3)  (3.1) 

The  problem  is  to  allocate  a  fixed  sampling  budget  to  the  three  populations  with  a  goal  of 
minimizing  the  MSE  of  the  estimator.  Let  Xn ,  Xi2, ...  be  i.i.d.  observations  from 
population  i  with  unknown  mean  Mi  and  variance  of  ,i  —  1,2,3.  Assume  observations 
from  the  three  populations  are  mutually  independent.  We  wish  to  determine  optimal 
sample  sizes  {n°pt,  nopt,  nopt)  which  minimize  the  first  order  approximation  of  the  MSE 

Ki ,n2,n3  —  E(  Xini(X2 n2  +  ^3n3)  ~  Ml(M2  +  ’  (3-2) 

subject  to  the  constraint  that  the  total  sampling  cost  is  Cini  +  c2n2  +  c3n3  <  b,  where 
Cjis  the  unit  cost  for  sampling  the  i-th  population,  n.  is  the  sample  size  for  the  ith 
populations  and  b  is  a  pre  specified  total  sampling  budget. 

3.1  First  Order  Allocation 

Using  a  Taylor  series  expansion,  one  can  show  that 

Kii,n2,n3  ^  l^ni.nj.ns  +  0((niri2)  1)  +  0((riin3)  *),  (3.3) 

where 

VnM  =  (M2  +  M3)2^  +  Mi~  +  lA—  (3.4) 

ni  n2  n3 

is  the  first  order  approximation.  The  values  {n[vt ,  n°2pt,  nopt),  referred  to  as  the,  first-order 
allocation ,  which  minimize  VnuTl2>n3  are 
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'  nf  = 

<  nf  = 


Therefore,  one  can  show  that  the  lower  bound  for  the  first  order  approximation  of  the 
MSE  is  given  by 

Vlow  =  =  V°(P)  +  ^(^)>  (3-6) 

with 

V°(b)  =  Vnf  ,nf  ,nf ■  (3-7) 

Because  the  first-order  allocation  (nf  ,nf  ,nf )  depends  on  the  unknown 
parameters  /ii,  /i2.  M3>  cr2,  and  cr3,  we  will  propose  a  two-stage  sampling  procedure 
similar  to  that  in  Zheng,  Seila,  and  Sriram  (1997a)  and  establish  its  optimality  properties 
as  the  total  budget  goes  to  infinity. 

3.2  Two  Stage  Sampling  Procedure 

The  objective  of  the  two-stage  procedure  is  to  determine  the  final  sample  sizes  that 
minimize  the  first-order  approximation  of  the  MSE  in  (3.4)  subject  to  the  fixed  overall 
budget  (sample  size).  The  procedure  works  as  follows:  In  stage  1  we  select  equal-sized 
samples  from  each  population.  Then  we  estimate  nf ,  nf ,  and  nf  in  (3.5)  by  nf  nf 
and  nf  using  sample  estimates  of  /z i,  /z2,  /z3,  a1,  a2,  and  <r3  computed  from  the  initial 
sample.  In  stage  2,  we  allocate  the  remaining  budget  based  on  the  stage  1  results.  The 
detailed  procedure  follows: 

Stage  One: 

(1)  Start  with  initial  random  samples  Xn,Xl2,  ...,Ximo,  X2i,X22, ...  ,X2mo,  and 
X31,  X32, X3mo,  for  a  suitable m0  to  be  defined  below. 


Cl  Cj(/i2+Zi3)+C2/li<72+C3WCr3 

_ by  l£2 _ 

Cl  C1(ft,+/l3)+C2/il<r2+C3Ml<73 

_ bjh£3 _ 

Cl  ^1(/l2+M3)+C2^i<T2+C3AilC3 


(3.5) 


(2)  Compute  sample  estimates: 
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1,2,3, 


~j  -  *  IITIq  ) 

.  _  2 


£  (x.j-  ximoY 
Mm o)  -  —  =-i -  (*  =  1, 2, 3) 


(3)  Let  n* 


_ bjifi-i _ 

Ci  <71(?2+/i3)+c2Ml(r2+C3A'l53  ’ 


n 


★  _ 


6/^3 


Cl  CT1(^2+M3)+C2M1CT2+C3lilS:3  ' 


(3.8) 


Stage  Two: 

(1)  Let 


N*  = 


(  mo 

if 

n* 

< 

too 

™  ,  nf(6-(ci+c2+c3)7no) 

0  +  (Cln*+c2n*) 

if 

n* 

> 

m0, 

n? 

> 

m0, 

n* 

< 

too 

™  i  n*(6-(ci+c2+c3)m{,) 

I  °  (cln* +C371*) 

if 

n* 

> 

mo, 

nf 

< 

mo. 

n* 

> 

m0 

(b- 

(C2  +  c3)m0)/ci 

if 

n* 

> 

m0) 

n* 

< 

mo. 

nf 

< 

m0 

<nl 

if 

n* 

> 

m0. 

n* 

> 

mo. 

n* 

> 

m0 

(3.9) 


w* 


f  mo 

if 

nf 

< 

m0 

™  ,  n^(6-(c1+c2+c3)m0) 

0  (c2n^+C!nf) 

if 

n* 

> 

m0. 

n} 

> 

mQ, 

nf 

< 

to0 

™  i  n^(b-(ci+c2+c3)mo) 
j  (c2n* +c3n*) 

if 

n* 

< 

too, 

nf 

> 

mo, 

nf 

> 

m0 

(6- 

(ci  +  c3)m0)/c2 

if 

n* 

< 

m0, 

n* 

> 

mo, 

n* 

< 

mo 

if 

n* 

> 

mo, 

n* 

> 

m0, 

n* 

> 

mQ 

(3.10) 


and 


N* 


(b-  ciN*  -  c2N*)/cz 


(3.11) 


Ni  =  [N?],  N2  =  {N*},  N3  =  [b  -  Ni  -  N2], 


(3.12) 


where  [a:]  is  the  largest  integer  less  than  or  equal  to  x. 


(2)  Sample  (iVj  —  m0)  more  observations  from  population  i,i  =  1,  2, 3. 

(3)  Finally,  estimate  /^i (ju2  +  te)  by  Xm(X2N2  +  Xm). 
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(3.13) 


In  step  (1)  of  stage  two,  the  estimated  optimal  values  nf,  n*,  and  n*  computed  using 
(3.8)  may  give  rise  to  the  following  three  cases:  (i)  only  one  of  them,  say  n*,  is  greater 
than  m0;  (ii)  exactly  two  of  them,  say  nf  and  nf,  are  greater  than  m0;  and  (iii)  all  three 
of  them  are  greater  than  m0.  Note  that  all  these  possibilities  are  identified  in  sets  Ai  to 
A7  in  Chapter  6  of  this  thesis.  In  case  (i),  all  additional  observations  are  sampled  from 
population  1  since  the  stage  1  sample  sizes  for  populations  2  and  3  are  larger  than  the 
estimated  optimal  sample  sizes  for  these  populations.  In  case  (ii),  all  additional 
observations  are  sampled  proportionally  from  populations  1  and  2  since  the  stage  1 
sample  size  for  population  3  is  larger  than  the  estimated  optimal  sample  size  for  this 
population.  Finally,  in  case  (iii),  take  nf ,  ri£,  and  rif  to  be  the  final  sample  sizes  since 
all  of  them  exceed  the  stage  1  sample  sizes. 

3.3  Optimal  Properties  of  the  Two  Stage  Procedure 

Theorem  3.1  below  shows  that  the  two  stage  procedure  defined  in  (3.8)  to  (3.12)  is 
asymptotically  risk  efficient.  That  is,  as  the  budget  grows  arbitrarily  large,  the  MSE  of 
the  estimator  in  (3.13)  approaches  the  minimum  MSE,  V°(b )  from  (3.7).  Before  we 
state  the  theorem,  we' give  a  list  of  sufficient  conditions  needed  to  prove  the  theorem. 
Assume  that  nu  ^2  and  p,3  are  all  positive.  Let  IA  denote  the  indicator  function  of  a  set 
A  and  for  e  >  0,  define 


aw  = 


AW  = 


AW  = 


Plfo+A)) 

—  C 

e, 

£3  _  £  £3 

+  e 

<T2  5  (72 

^l(^2+w) 

Ml  ^2 

6j 

Atifo+As) 


gllf*2+/f3) 


(3.14) 

(3.15) 

(3.16) 


Ml  ^2 
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D4(6) 

= 

—  -  e  22  +  e 

<r3  ff3  T 

(3.17) 

Ds(e) 

= 

Crl(M2+M3)  c  0"i(M2+M3)  I  r 

1*103  ’  Hl°3  ' 

(3.18) 

Ri  = 

^l(cT2+cr3) 

(3.19) 

r2  = 

a 

^2 

(3.20) 

Rz  = 

Ml^ 

(3.21) 

= 

£2 

<^3 

(3.22) 

R5  = 

^1  (M2+M3) 

Mi 

(3.23) 

Ti(  n) 

_  A'in(S2„4-53ri) 

^lr;(-^2n+^3n) 

(3.24) 

r2(n) 

_ _  £>2n 

~  S2n 

(3.25) 

Is(n) 

c 

hn{^2n+Xzn) 

Xln^n 

(3.26) 

T4(n) 

11 

(3.27) 

3i(n) 

_  £ 

hn{X2n+Xzn) 

XlnSzn 

(3.28) 

The  sufficient  conditions  are: 

(1)  EXf  <  oo ,  *  =  1, 2, 3;  (3.29) 

(2)  There  exists  a  /?  >  1  such  that  for  every  e  >  0 

t0P[Ti(t)  i  Di{e)}  =  0(1),  *  =  1,2, 3, 4, 5,  as  t  -*  oo;  (3.30) 


(3)  There  exists  an  e  >  0  and  e  <  min!=i)2i3l4,5  Ri  such  that  for  each  i  =  1,2, 3, 4,  5 


f{  /  Ti(t)dP  -  R![}  —  0(1),  as  t  ->  oo;  for  k  =  -  4,4.  (3.31) 

J[Ti(t)eDi(e)} 
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Theorem  3.1  (Risk  Efficiency):  Suppose  that  populations  1,  2  and  3  satisfy  conditions 
(3.29)  to  (3.31).  For  a  (3  satisfying  (3.30),  let  ^  <  a  <  1  -  ^  and  assume  that  the 
initial  sample  size  rn0  =  ba.  For  N2  defined  in  (3.12)  and 
Hq  =  min{ 2  —  a,  q(1  +  |),  1  +  ck}  we  have 

E{X1Ni(X2n2  +  X^Nz)  ~  Fi(F2  +  Pz))2  =  V°(b)  +  O(jfc),  as  b  —>  oo,  (3.32) 
where  V°(b)  is  defined  as  in  (3.7). 

Proof:  See  Chapter  6. 


It  can  be  shown  that  Normal,  Poisson,  Exponential,  and  Bernoulli  populations  satisfy 
the  conditions  in  (3.29)  to  (3.31)  for  any  (3  >  1.  For  example  see  Zheng,  Seila,  and 
Sriram  (1995).  For  discrete  populations,  such  as  Poisson  and  Bernoulli,  one  needs  to 
modify  the  two  stage  procedure  slightly  to  account  for  the  fact  that  the  sample  means  and 
variances  could  be  zero  with  positive  probability.  One  need  only  use  the  following 
modified  estimators. 

f  X-  ifXim=0 

Fi  =  4  ^  °.  (3.33; 

^  X imQ  otherwise 


rriQ 


if  *i  =  0 
otherwise 


The  estimators  in  (3.33)  were  used  in  the  simulations  described  in  the  next  section.  This 
modification  is  important  because  if  cfa 1(ju2  +  ju3 )  +  C2/i1o:2  +  cfjlfd-i  —  0,  n*  are 
undefined.  The  modification  also  solves  a  more  subtle  difficulty.  For  example,  if 
Pi  =  0.96,  p2  =  0.5,  pi  =  0.99,  b  =  800,  and  the  populations  are  binomial,  then  the 
optimal  allocation  is  (269.26,  442.65,  88.09).  If  a  failure  does  not  occur  in  population  3 
in  the  initial  sample,  s3  =  0  and  therefore  n*  =  0,  a  poor  estimate  of  n3pt  indeed. 

3.4  Empirical  Results 

Suppose  we  want  to  estimate  the  function  in  (3.1)  where  the  unknown  means  are  from 
three  Bernoulli  populations.  In  this  situation,  a  one  stage  sampling  scheme  might  divide 
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the  given  total  sampling  budget  b  as  follows:  sample  [|]  observations  from  population  1, 
and  [|]  observations  each  from  populations  2  and  3.  Then  the  one  stage  estimator  of 
£0(/42  +  H 3)  is  Xl[b_](X2[t]  +  We  present  some  simulation  results  that 

demonstrate  the  performance  of  our  two  stage  estimator  (based  on  the  two  stage  sampling 
scheme)  is  better  than  the  one  stage  estimator.  Furthermore,  we  demonstrate  the 
asymptotic  risk  efficiency  of  our  two  stage  estimator. 

Note  that  the  MSE  of  the  one  stage  estimator  if  |  is  an  integer  is 

Vone  =  2(/J2  +  ^z)2  +  ^(^2  ))  ^  b  — ►  00. 


Let  Vtwo  denote  the  MSE  of  the  two  stage  procedure  (see  3.32)  and  let  Viow  be  as  defined 
in  (3.6).  For  the  simulations  we  assume  Bernoulli  populations  with  jii  =  0.99, 

H2  =  0.45,  and  ju3  =  0.3.  We  use  equal  sampling  cost  for  each  population,  that  is  c*  =  1 
for  i  =  1,2,3.  The  simulation  was  carried  out  using  a  31-bit  prime  modulus 
multiplicative  congruential  random  number  generator  with  modulus  231  -  1.  The  random 
number  generator  uses  the  multiplier  742938285.  Turbo  Pascal  was  the  programming 
language  used. 

Let  V one  be  the  estimator  of  Von e  and  V two  be  the  estimator  of  Vtwo >  based  upon  77- 
replications.  Table  3.1  gives  values  for  Vone/Viow  and  values  for  Vtwo/Vtow  using 
a  =  0.5  along  with  its  95%  confidence  intervals  for  different  values  of  b.  We  used 
10,000  iterations  to  get  these  values.  The  limiting  value  of  Vone/Viow  is  1.732.  From  this 
table  we  see  that  Vtwo/Vlow  is  smaller  than  Vone/Viow  thus  demonstrating  the  superiority 
of  the  two  stage  sampling  procedure  over  the  one  stage  plan.  We  also  see  from  the  table 
that  Vtwo/Viow  becomes  closer  and  closer  to  one  as  b  increases.  This  shows  the  MSE  of 
our  two  stage  estimator  approaches  the  theoretical  minimum  MSE  as  b  — >  oo,  providing 
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empirical  evidence  for  the  results  proved  in  Theorem  3.1.  Figure  3.1  gives  the  graphs  for 
the  results  in  Table  3.1. 


Table  3. 1 :  Optimality  of  Two  Stage  Estimator  with  10,000  Iterations 


b 

20 

40 

80 

100 

200 

400 

800 

V  one/Vlow 

1.691 

1.748 

1.726 

1.715 

1.754 

1.744 

1.706 

OK 

6 

o 

o 

6 

1.438 

1.373 

1.206 

1.176 

1.145 

1.061 

1.027 

95%  Cl  Ubd 

1.479 

1.412 

1.240 

1.208 

1.176 

1.091 

1.055 

95%  Cl  Lbd 

1.398 

1.324 

1.170 

1.143 

1.113 

1.032 

1.000 

1  Stage  Vs  2  Stage  Procedure 


b  =  sampling  budget 


**•  One  Stage  Procedure 
95%  LCB 

~Ar  Two  Stage  Procedure 
95%  UCB 


Figure  3.1 

We  also  conducted  simulations  to  test  the  effect  of  different  values  of  a.  and  thus 
different  initial  sample  sizes.  Table  3.2  gives  VtWo/Vlow  and  its  95%  confidence  intervals 
for  different  values  of  a  and  b\  figure  3.2  displays  this  information  graphically. 
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b  =  sampling  budget 


m  alpha  =  0.4 
alpha  =  0.5 
-A-  alpha  =  0.6 


Figure  3.2 

Table  3.2  and  Figure  3.2  indicate  there  is  no  clear  preference  among  the  different  values 
for  a  for  all  budget  sizes;  however,  the  smaller  values  of  a  seem  to  perform  better  at 
small  values  of  b  and  a  =  0.6  seems  to  be  a  better  choice  for  larger  values  of  b. 

Table  3.3  gives  Vtwo/Viow  and  its  95%  confidence  intervals  for  different  values  of  ol 
and  b  for  a  different  set  of  population  means.  For  these  simulation  results,  —  0.99, 

(j,2  =  0.95,  and  /i3  =  0.9.  The  value  for  a  labeled  "variable"  is  determined  by  the 
function  a  =  (0.5  +  1  -  ln(3)/ln(6))/2.  This  allows  the  value  of  a  to  change  along  with 
b.  Figure  3.3  displays  the  information  in  Table  3.3  graphically. 


Table  3.3  Estimated  Vtwo/Viow  and  Its  95%  Cl  with  10,000  Iterations 
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40  80  100  200  400  800  1000  2000  4000  8000 

b  =  sampling  budget 


-*  alpha  =  0.5 
-±-  alpha  =  0.6 
variable 


Figure  3.3 

Table  3.3  and  Figure  3.3  indicate  a  =  0.5  is  the  best  choice  for  small  values  of  b  while 
using  a  "variable"  value  for  a  is  the  best  choice  for  values  of  b  between  80  and  2000.  For 
larger  values  of  b,  a  =  0.6  seems  to  be  the  best  choice.  Table  3.3  and  Figure  3.3  also 
indicate  the  convergence  of  V two  to  Viow  is  slower  when  the  population  means  are  close 


CHAPTER  4 


THE  M/G/l  QUEUEING  MODEL 


We  will  consider  the  problem  of  optimally  allocating  a  sampling  budget  between  the 
interarrival  times  and  service  times  for  the  stationary  M/G/l  queue  with  the  goal  of 
minimizing  the  mean  squared  error  of  an  estimator  of  the  mean  waiting  time  in  queue.  Let 
Xu, X\2,  •  ••  be  i.i.d.  observations  from  the  interarrival  time  distribution  with  mean  (3 and 
X2\ 1  X22,  •••be  i.i.d.  observations  from  the  service  time  distribution  with  mean  r  and 
variance  a2.  Assume  observations  from  the  two  distributions  are  mutually  independent. 
Additionally  assume  the  traffic  intensity  p  ~  P  <  Since  p  <  1,  the  system  is 
stationary  and  the  mean  waiting  time  in  queue  can  be  computed.  This  is  the  performance 
measure  we  will  estimate: 


W((3,r,a2) 


a2  +  t2 

2(p  -  t)  ’ 


(4.1)  is  taken  from  expression  (8.33)  in  section  8.5  of  Ross  (1993)  with  j  =  ft- 
4.1  Properties  of  Estimators  of  Mean  Waiting  Time 

In  the  case  of  the  M/M/1  queue,  Zheng,  Seila,  and  Sriram  (1997b)  showed  that  the 
substitution  estimator  for  the  mean  waiting  time  in  queue  has  infinite  MSE.  The  problem 
of  estimating  mean  waiting  time  in  the  M/G/l  queue  is  different  from  that  in  the  M/M/1 
queue  because  in  the  M/M/1  queue  the  service  time  distribution  has  only  one  parameter, 
the  mean,  to  be  estimated.  In  the  M/G/l  queue,  we  will  need  to  estimate  two  parameters 
from  the  service  time  distribution,  the  mean  and  variance:  r  and  cr2.  Still,  the  natural 
estimator  of  the  mean  waiting  time,  Wq  a2^ ,  obtained  by  substituting  Xlni  for  the 

—  „  £  (Xi-  x2n2) 

mean  interarrival  time,  W2n,  for  the  mean  service  time,  and  s 2  =  — - ^ - for  the 

72-2  —  1 
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variance  of  the  service  times,  has  infinite  mean  squared  error.  The  following  theorem 
establishes  this  fact.  The  proof  is  given  in  Chapter  6. 

^  S2+x2 

Theorem  4.1:  LetW  i„  =  Then  the  following  hold: 

"i  Z\Xini  —  X2n2  ) 


-E’l^lgl  =  +  00, 


(4.2) 


E{Wlq  -  Wqf  =  +  oo. 


(4.3) 


Since  W \q  has  infinite  mean  squared  error,  it  is  not  useful  to  estimate  mean  waiting  time 
in  the  queue,  especially  since  our  objective  is  to  find  the  sample  allocation  with  minimum 
mean  squared  error.  We  propose  the  following  alternative  estimator:  Let  p0  <  1  and 
assume  that  p  <  po.  Define 


'  s2+x2 


2 7i2 


W„  =  < 


2(Xlni-X2„2) 


2(1  -Po) 


if  -^2n2,  <  P0Xlnu 
otherwise 


(4.4) 


The  following  theorem  establishes  the  statistical  properties  of  the  alternative  estimator, 
Wq  as  defined  in  (4.4).  The  proof  is  given  in  Chapter  6. 


Theorem  4.2:  For  the  estimator  defined  by  (4.4)  in  an  M/G/l  queue  with  traffic  intensity 
p  <  po  <  1,  where  po  is  known, 

EWq  =  Wq  +  0(— )  +  0(— ),  (4.5) 

ni  n2 


E{Wq  -  Wqf 


(f±T  Y  f 

4(/3-r)4  Tij 


+ 


(2§t- f+ffcr2 


4(/5— r)4  n2  '  4{j3-r)2 


+ 


k4— a4  \ 

n2  J 


(4.6) 


where  k3  and  rf  are  the  third  and  fourth  central  moments  respectively  of  the  service  time 
distribution. 

4.2  First  Order  Allocation 


- — •«. 

Therefore,  using  Wq  to  estimate  the  mean  waiting  time,  our  goal  is  to  determine 
optimal  sample  sizes  ( n°pt ,  n°2vt )  which  minimize  the  first  order  approximation  of  the 
MSE  in  (4.6): 


ro 

v  ni  ,7&2 


(ff2+r)2  /32  .  (2 /?t-t2+(72)2  a \  .  1  /  K4-q4A  ,  2 ffr-r2+g2  K3 

4 (/3-r)4  ^  ^  4(/?-t)4  n2  ~r  4(/?-r)2  V  n2  /  ^  2(/?-r)3  n2  5 


(4.7) 


subject  to  the  constraint  that  the  total  sampling  cost  n\  +  <  b,  where  ni  is  the  sample 

size  for  the  i-th  population  and  b  is  a  pre  specified  total  sampling  budget.  The  values 
{n0^1 1n°2>t),  referred  to  as  the  first-order  allocation  which  minimize  (4.7)  are 


^opt  _  b/?(r2+£72)(A— /?(r2+cr2)) 

nl  —  # 

opt  6A(A-/?(r2-fcr2)) 

n2  ~~  f 


(4.8) 


where 

A2  =  /32 (4 k3t  +  k4  +  4 r2a2  -  a4)  -  2/?(/c3(3t2  -  a2)  +  t(k4  +  a2(2 r2  -  3a2))) 

+  2  kzt(t2  —  a2)  +  «4r2  +  t4ct2  —  3r2a4  +  cr6  (4.9) 

and 

$  =  (32(4k3t  +  k4  —  t4  +  2a2  (r2  —  cr2))  —  2/5(/c3(3t2  -  a2) 

+  t(k4  +  a2(2r2  —  3a2)))  +  2  k3t(t2  —  a2)  +  re4r2  +  a2(r4  —  3r2a2  +  a4). 
Therefore,  one  can  show  that  the  lower  bound  for  the  MSE  is  given  by 

Vlow  =  Vn°r\nf  =  V°(b)  +  O(^), 


(4.10) 
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4.3  Two  Stage  Sampling  Procedure 

Because  the  first-order  allocation  (nopt ,  nopt)  depends  on  the  unknown  parameters  (3, 
r,  and  a2,  we  will  propose  a  two-stage  procedure  similar  to  that  in  Zheng,  Seila,  and 
Sriram  (1997b)  and  establish  its  optimality  properties  as  the  total  budget  goes  to  infinity. 

The  objective  of  the  two-stage  procedure  is  to  determine  the  final  sample  sizes  for  the 
two  populations  that  minimize  the  first-order  approximation  of  the  MSE  in  (4.6)  subject 
to  the  fixed  overall  budget  (sample  size).  The  procedure  works  as  follows:  In  stage  1  we 
select  equal-sized  samples  from  each  population.  Then  we  estimate  n°pt  and  nopt  in  (4.8) 
by  nf  and  nf  using  sample  estimates  of  /3,  r,  and  a2  computed  from  the  initial  sample. 
In  stage  2,  we  allocate  the  remaining  budget  based  on  the  stage  1  results.  The  detailed 
procedure  follows: 

Stage  One: 

(1)  Start  with  initial  random  samples  Xn,  Xn, Ximo  and  X21,  X22, ...,  X2mo 
for  a  suitable  mo  to  be  defined  below. 

(2)  Compute  sample  estimates: 

~  _  _  E  [Xi-  x2mo) 

P  =  Ximo,  t  =  X2mo,  a2 (mo)  =  1-1  - . 


(3)  Let  nj*  = 


★  _  6/?(r  +9  )( A-j9(r2+g  )) 


no 


<I> 

frA(A-/?(r2+g2)) 

<S  _  ’ 


,  and 


(4.12) 


where  A  and  $  are  as  in  (4.9)  and  (4.10)  respectively  with  /3,  r,  and  a2  replaced 
by  their  respective  estimators. 


Stage  Two: 

(1)  Let 
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and 


Nf 


T ★  — 


mo 

1 A 

3 

o 

b  —  mo 

3 

IV 

o 

(4.13) 

nf 

if  mo  <  n*  <  b  —  m o 

-N* 

(4.14) 

Ni  =  [N*],  N2  =  b-  N2, 

(4.15) 

where  [x]  is  the  largest  integer  less  than  or  equal  to  x. 

(2)  Sample  (jV*  —  m0)  more  observations  from  population  i,i  —  1,  2. 

(3)  Finally,  estimate  Wq  by  Wq. 


4.4  Optimal  Properties  of  the  Two  Stage  Estimator 

Theorem  4.3  below  shows  that  the  two  stage  procedure  defined  in  (4.12)  to  (4. 15)  is 
asymptotically  risk  efficient.  Before  we  state  the  theorem,  we  give  a  list  of  sufficient 
conditions  needed  to  prove  the  theorem.  The  proof  is  given  in  Chapter  6. 

Let 

Z+  denote  the  positive  integers, 


(4.16) 


e  =  min(^,p0  -  ^), 


(4.17) 


and 


D(t) 


0 


T 


The  sufficient  conditions  are: 

(1)  EX]2<oo,i  =  l,2,  j  e  Z+- 

(2)  There  exists  an  e  6  (0,  e) 


(4.18) 


(4.19) 
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t{  [  Tk(t)dP-  {^)k}  =  0(1),  asi  — >  oo;  for  k  =  1,2, 3, 4.  (4.20) 

J[T{t)eDM]  P 

Theorem  4.3  (Risk  Efficiency):  Suppose  populations  1  and  2  satisfy  conditions  (4.19) 
to  (4.20).  Let  .5  <  a  <  1  —  and  assume  that  the  initial  sample  size  mo  =  ba.  For 
Ni  defined  in  (4.15)  and  ho  =  min{ 2  —  a,  1  +  a}  we  have 

E(Wq  -  Wq)2  =  V°(b)  +  O(^),  asb  oo,  (4.21) 

where  V°(b)  is  defined  as  in  (4.1 1). 

The  two  stage  procedure  outlined  in  (4.12)  to  (4.15)  is  valid  for  any  service  time 
distribution.  One  interesting  special  case  of  the  M/G/l  queue  is  the  M/D/1  queue  where 
the  service  times  are  deterministic.  Our  two  stage  procedure  can  handle  this  special  case. 
The  values  of  a2,  k3,  and  k4  are  all  zero  when  the  service  times  are  deterministic.  The 
result  of  minimizing  the  first  order  approximation  of  the  MSE  is  that  riff1  =  0  indicating 
the  entire  budget  should  be  spent  sampling  the  interarrival  times.  In  practice,  using  the 
two  stage  procedure,  the  experimenter  would  have  already  spent  mo  of  his  budget 
sampling  from  the  service  time  distribution  to  get  estimates  of  j3,  r,  a2,  kz,  and  k4.  The 
remaining  budget  would  then  be  allocated  to  the  interarrival  time  distribution  based  on 
the  result  that  n*  =  0.  The  practical  implications  of  using  the  two  stage  sampling 
scheme  under  various'  service  time  distributions  is  explored  further  in  Chapter  5  of  this 
dissertation. 


CHAPTER  5 


EMPIRICAL  RESULTS  FOR  THE  QUEUEING  MODEL 

We  designed  and  ran  simulations  to  demonstrate  the  properties  presented  in  Theorem 
4.3,  and  to  evaluate  the  performance  of  the  M/G/l  model  under  different  service  time 
distributions.  Of  particular  interest  was  the  performance  of  the  M/G/l  sampling  scheme 
when  the  service  time  distribution  is  exponential.  The  effects  of  changing  parameters 
such  as  p,  traffic  intensity,  cv,  coefficient  of  variation,  and  a,  used  to  compute  the  initial 
sample  size,  were  also  investigated.  The  simulations  used  the  same  hardware  and 
software  as  those  presented  in  Chapter  3.  10,000  replications  were  performed  for  each 
design  point. 

The  performance  of  the  M/G/l  sampling  scheme  presented  in  (4.12)  to  (4.15)  was 
evaluated  in  light  of  the  performance  of  the  M/M/1  sampling  procedure  developed  by 
Zheng,  Seila  and  Sriram  (1997b).  Table  5.1  is  a  reproduction  of  the  some  of  the  data  in 
Table  1  of  Zheng,  Seila  and  Sriram  (1997b).  It  contains  values  for  (V two  ~  Viow/Vlow)  as 
computed  using  their  sampling  scheme  for  the  M/M/1  queue  described  in  Chapter  2  of 
this  dissertation. 
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Table  5.1  Asymptotic  Values  of  Q V two  ~  Viow/Viow)  in  the  M/M/1  queue 


b 

p  =  0.50 

p  =  0.60 

p  =  0.70 

p  =  0.80 

p  =  0.85 

100 

3.084 

4.143 

2.411 

-0.247 

-0.801 

200 

0.799 

2.193 

2.232 

0.079 

-0.700 

400 

0.256 

0.522 

1.312 

6.403 

-0.547 

800 

0.158 

0.244 

0.547 

0.670 

-0.325 

1000 

0.061 

0.224 

0.41 1 

0.641 

-0.256 

2000 

0.053 

0.105 

0.186 

0.456 

0.003 

4000 

0.031 

0.046 

0.078 

0.222 

0.121 

8000 

0.011 

0.013 

0.048 

0.073 

0.188 

The  results  in  Table  5.2  were  achieved  using  the  M/G/l  sampling  scheme  presented  in 
(4.12)  to  (4.15)  and  assuming  an  exponential  service  time  distribution  with  values  of  p  in 
(0.5, 0.7)  and  p0  =  0.9.  95%  upper  and  lower  confidence  bounds  are  also  provided. 
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Table  5.2  Asymptotic  Values  of  (! Vtwo  ~  Viow/Viow)  in  the  M/G/l  queue 


b 

LCB 

LCB 

O 

II 

UCB  | 

100 

2.522 

2.994 

3.912 

4.291 

2.243 

200 

Buna 

0.808 

1.016 

— rlia 

2.018 

2.318 

2.276 

ESI 

Buma 

0.242 

0.298 

0.512 

wivsrm 

1.315 

1.470 

mu 

gum 

0.182 

0.228 

0.174 

Eaa 

0.564 

0.656 

1  1000 

0.088 

0.125 

0.135 

0.231 

0.287 

0.364 

0.441 

0.054 

0.090 

Kim 

mi 

0.096 

0.133 

0.124 

EEEEI 

EMI 

-0.021 

0.009 

0.039 

0.070 

0.047 

EBB 

|  8000 

-0.017 

0.012 

0.041 

-0.031 

0.026 

0.061 

0.095  | 

b 

LCB 

II 

oo 

O 

UCB 

LCB 

ID 

GO 

II 

Cl 

UCB  | 

100 

-0.295 

-0.275 

-0.797 

-0.805 

11 

200 

0.011 

0.041 

0.072 

-0.706 

-0.701 

0.355 

0.401 

0.447 

gum 

0.698 

1000 

0.612 

0.680 

2000 

-0.010 

0.016 

4000 

0.136 

0.173 

8000 

0.067 

0.104 

0.142 

0.131 

0.172 

0.213 

The  results  in  Table  5.2  demonstrate  the  convergence  of  the  estimated  MSE,  Vtwo  to 
the  first  order  term  for  the  MSE,  Viow  given  by  (4.1 1).  From  Table  5.2,  one  can  see  that 
as  the  total  sampling  budget,  b,  becomes  large,  Vtwo  ~  Viow  tends  to  0.  This  empirical 
data  supports  the  theoretical  results  claimed  in  Theorem  4.3.  One  may  observe  from 
Table  5.2  that  the  convergence  of  Vtwo  -  Viow  to  0  is  much  slower  as  the  traffic  intensity 
increases.  Comparing  the  results  from  Table  5.2  with  that  of  Table  5.1,  one  can  see  the 
performance  of  the  M/G/l  sampling  procedure  is  remarkably  similar  to  the  results  using  a 
scheme  based  on  the  M/M/1  queue.  There  appears  to  be  no  degradation  in  performance 
by  using  the  more  general  model.  Figure  5.1  displays  this  information  graphically  for  the 
traffic  intensity  p  =  0.7. 
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Exponential  Service  Times 

rho  =  0.7 


100  200  400  800  1000  2000  4000  8000 

b  =  sanpling  budget 


■  MW1 
□  IWG/1 


Figure  5.1 

When  the  service  times  are  exponential  the  coefficient  of  variation  of  the  service  time 
distribution  is  1.0.  We  conducted  simulations  using  different  service  time  distributions  to 
study  the  effect  of  changing  the  coefficient  of  variation  of  the  service  times.  We  first 
looked  at  cv  values  greater  than  one.  For  these  simulations  we  used  a  hyperexponential 
service  time  distribution.  The  hyperexponential  distribution  with  three  parameters,  p,  T\, 
and  r2  can  be  thought  of  as  being  created  from  two  exponential  distributions  with  means 
r\  and  72  respectively.  To  sample  from  a  hyperexponential  distribution,  one  samples 
from  the  first  exponential  distribution  with  probability  p  or  the  second  exponential 
distribution  with  probability  (1  —  p). 

Table  5.3  provides  values  of  (Vtwo  -  Viow/Viow)  and  its  95%  confidence  intervals  for 
our  two  stage  procedure  with  various  traffic  intensities  when  the  coefficient  of  variation  is 
2.0.  Table  5.4  contains  the  same  information  when  the  coefficient  of  variation  is  5.0. 
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Table  5.4  Asymptotic  Values  of  (V two  -  Viow/Viow),  CV  =  5.0 


b 

LCB 

p  =  .50 

UCB 

LCB 

li 

o 

UCB 

100 

0.977 

1.283 

1.589 

2.418 

2.587 

2.757 

200 

0.327 

0.392 

0.456 

1.801 

1.990 

2.179 

400 

0.081 

0.123 

0.164 

0.645 

0.754 

0.864 

800 

0.032 

0.067 

0.101 

0.219 

0.281 

0.343 

1000 

0.037 

0.070 

0.103 

0.188 

0.240 

0.292 

2000 

-0.003 

0.026 

0.056 

0.048 

6.085 

0.121 

4000 

0.002 

0.031 

0.061 

0.019 

0.051 

0.084 

8000 

-0.019 

0.009 

0.038 

-0.035 

-0.006 

0.024 

b 

LCB 

O 

QO 

II 

UCB 

LCB 

p  =  .90 

UCB 

100 

-0.015 

0.009 

0.033 

-0.943 

-0.941 

-0.939 

200 

0.329 

0.369 

0.409 

-0.916 

-0.913 

-0.911 

400 

0.578 

0.637 

0.696 

-0.882 

-0.879 

-0.875 

800 

0.558 

0.630 

0.701 

-0.839 

-0.833 

-0.828 

1000 

0.522 

0.597 

0.671 

-0.821 

-0.815 

-0.809 

2000 

0.250 

0.305 

0.361 

-0.781 

-0.773 

-0.765 

4000 

0.098 

0.140 

0.181 

-0.732 

-0.722 

-0.712 

8000 

0.043 

0.077 

0.111 

-0.693 

-0.681 

-0.668 

We  also  studied  cv  values  less  than  one.  For  these  simulations  the  service  times  were 
distributed  uniformly.  Table  5.5  provides  values  of  (V two  —  Viow/Viow)  and  its  95% 
confidence  intervals  for  our  two  stage  procedure  with  various  traffic  intensities  when  the 
coefficient  of  variation  is  0.01 .  Table  5.6  contains  the  same  information  when  the 


coefficient  of  variation  is  0.5. 


imR] 


s 


0.370 


0.134 


0.052 


0.062 


0.033 


0.831 


0.425 


0.177 


0.094 


0.063 


UCB 

LCB 

0.933 

2.479 

0.479 

1.263 

0.311 

liEKtl 

0.213 

0.204 

0.167 

0.115 

0.100 

0.125 

0.098 

2.703 


1.442 


2.927 


1.622 


0.586 

0.328 


0.252 

0.171 


b 

LCB 

UCB 

LCB 

p  =  .90 

UCB 

100 

0.544 

0.588 

0.613 

-0.902 

-6.899 

-0.896 

200 

0.861 

0.928 

0.995 

-0.868 

-0.864 

-0.859 

860 

447 


0.361 


0.191 


10 

57 


0.945 


0.517 


0.421 


0.237 


0.148 


0.089 


48 

28 


0.186 


0.120 


54 

18 


-0.675 


-0.618 


-0.807 

-0.751 


73 

69 


50 

87 


Table  5.6  Asymptotic  Values  of  ( Vtwo  -  Vlow/Vlow),  CV  =  0.5 


b 

LCB 

O 

II 

UCB 

LCB 

p-  .70 

UCB 

100 

0.977 

1.283 

1.589 

2.418 

2.587 

2.757 

200 

0.327 

0.392 

0.456 

1.801 

1.990 

2.179 

400 

0.081 

0.123 

0.164 

0.645 

0.754 

0.864 

800 

0.032 

0.067 

0.101 

0.219 

0.281 

0.343 

1000 

0.037 

0.070 

0.103 

0.188 

0.240 

0.292 

2000 

-0.003 

0.026 

0.056 

0.048 

0.085 

0.121 

4000 

0.002 

0.031 

0.061 

0.019 

0.051 

0.084 

8000 

-0.019 

0.009 

0.038 

-0.035 

-0.006 

0.024 

b 

LCB 

O 

OO 

II 

UCB 

LCB 

II 

0> 

o 

UCB 

100 

-0.015 

0.009 

0.033 

-0.943 

-0.941 

-0.939 

200 

0.329 

0.369 

0.409 

-0.916 

-0.913 

-0.911 

400 

0.578 

0.637 

0.696 

-0.882 

-0.879 

-0.875 

800 

0.558 

0.630 

0.701 

-0.839 

-0.833 

-0.828 

1000 

0.522 

0.597 

0.671 

-0.821 

-0.815 

-0.809 

2000 

0.250 

0.305 

0.361 

-0.781 

-6.773 

-0.765 

4000 

0.098 

0.140 

0.181 

-0.732 

-0.722 

-0.712 

8000 

0.043 

0.077 

0.111 

-0.693 

-0.68 1 

-0.668 

Figures  5.2  and  5.3  display  the  information  in  Tables  5.3  through  5.6  graphically. 
Figure  5.2  shows  the  effect  of  traffic  intensity  on  the  convergence  of  the  two  stage 
procedure  when  CV  =  0.5. 
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Figure  5.2 

Figure  5.2  shows  that  the  MSE  of  the  two  stage  estimator  converges  to  the  theoretical 
minimum  MSE  much  slower  for  higher  values  of  traffic  intensity. 

Figure  5.3  shows  the  effect  of  coefficient  of  variation  in  the  service  times  on  the 
convergence  of  the  two  stage  estimator  when  p  =  0.7. 
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100  200  400  800  1000  2000  4000  8000 

b  =  sampling  budget 


•  CV  =  0.01 
CV  =  0.5 
CV  =  2.0 


♦  CV  =  5.0 


Figure  5.3 

Figure  5.3  shows  that  the  MSE  of  the  two  stage  estimator  converges  to  the  theoretical 
minimum  MSE  much  slower  for  higher  values  of  coefficient  of  variation  in  the  service 
time  distribution. 

We  also  conducted  simulations  to  determine  the  effect  of  a  and  therefore  initial 
sample  size  on  the  performance  of  the  sampling  scheme.  For  these  simulations  we  chose 
three  different  values  of  a:  0.5,  (0.5  +  1  -  ln(2)/ln(6))/2,  and  (1  -  ln(2)/ln(6)).  These 
values  of  a  are  referred  to  as  low,  middle,  and  high  respectively.  The  "high"  value  of  a 
can  actually  be  thought  of  as  a  one  stage  procedure  since  mo  =  ba  =  &(i-in(2)/in(&))  _  | 
The  service  time  distribution  for  these  simulations  was  exponential.  The  traffic  intensity 
was  0.5.  Table  5.7  provides  values  for  (V two  -  Viow/Viow)  for  various  values  of  b.  Figure 
5.4  displays  this  information  graphically. 


Table  5.7  Estimated  {V two  ~  Viow/Vi 


M/G/l  Queue  with  Ex] 


b 

a 

95%  Lbd 

low 

4.042 

100 

middle 

2.326 

high 

2.065 

low 

1.314 

200 

middle 

0.577 

high 

0.591 

low 

0.286 

400 

middle 

high 

0.093 

800 

0.072 

high 

0.149 

low 

0.116 

1000 

middle 

0.050 

high 

low 

2000 

middle 

0.032 

high 

0.072 

low 

-0.033 

4000 

middle 

high 
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B  low 
middle 
high 


Figure  5.4 

Table  5.7  and  Figure  5.4  indicate  there  is  no  clear  preference  among  the  different  values 
for  a  for  all  budget  sizes,  especially  beyond  b  =  400.  These  results  do  indicate  a  =  0.5 
is  likely  not  a  good  choice  for  initial  sample  size  at  least  for  small  values  of  b.  It  is 
interesting  that  the  two  stage  estimator  does  not  appear  to  outperform  the  one  stage 
estimator  represented  by  the  "high"  a  value.  This  is  because  the  true  minimum  of  the 
first  order  approximation  of  the  MSE  in  this  case  is  0.0266491 1.  The  MSE  of  the  one 
stage  estimator  is  0.0280.  This  small  difference  could  not  be  detected  because  of  the 
sampling  error  experienced. 

All  of  the  above  simulations  demonstrate  that  the  MSE  of  the  two  stage  estimator 
converges  to  the  minimum  MSE  at  reasonable  final  sample  sizes  under  various  service 
time  distributions,  traffic  intensities,  and  initial  sample  sizes. 


CHAPTER  6 


PROOFS  OF  THEOREMS 


6.1  Proof  of  Theorem  3.1 

Let  Ti  denote  Tfimfi,  i  =  1, 2, 3, 4, 5  defined  in  (3.24)  through  (3.28).  Define  the 
following  sets  for  nf ,nf ,  and  nf  defined  in  (3.8). 

Ai  =  [nf  >  mo,  nf  >  ra0,  nf  >  m0],  A2  =  [nf  >  m0,  nf  >  m0,  nf  <  m0], 

^3  =  [nf  <  mo,  nf  >  mo,  nf  >  m0],  A4  =  [nf  >  m0,  nf  <  m0,  nf  >  m0], 

A5  =  [nf  >  mo,  nf  <  m0,  nf  <  m0],  A6  =  [nf  <  m0,  nf  >  m0,  nf  <  m0], 

Aj  —  [nf  <  mo,  nf  <  mo,  nf  >  m0]. 

Clearly,  Ax,  A2, ...,  A7  are  disjoint  and  the  union  of  these  sets  is  the  sample  space.  Our 
proof  of  Theorem  3.1  depends  on  the  following  lemmas. 

Lemma  6.1.1:  Assume  conditions  (3.29)  to  (3.31).  For  ft  in  condition  (3.30),  let 
^4  <  a  <  1  —  and  m0  —  ba.  Then,  for  sets  Ax  through  A7  defined  above  we  have, 
as  b  — >  oo, 


P(Ai)  =  0(-^),l  =  2, 3,. ..,7, 


(6-1) 


(6.2) 


£(T2  +  Ta)  =  S  +  +  0(^> + 


(6.3) 


+  =  S  +  +  °( +  )• 


(6.4) 
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E[(ti)'/4,]  =  0(i). 

(6.5) 

B[(T2  +  T3)‘/X|1  =  0(1), 

(6.6) 

B[(T4  +  T5)'/a]=0(  1),  for  1  =  2,3, 4. 

(6.7) 

Proof.  For  (6.1),  we  only  prove  the  case  l  =  2  as  the  rest  of  the  cases  can  be  proved  in  a 
similar  way.  Now,  for  Z)j(e)  defined  in  (3.14)  to  (3.18)  we  have 


P(A 2)  <  P(n*  <  mo) 

_  p^___bXAS1 


=  P 

=p(f 


Xi(S2~hS3)+Si  (X2+X3) 

(S2+S3)+S1(X2+X3) 
bX:S3 


\  1 
(* 


+ 


XA  +  5i(X2+X3) 


=  P 


21  _l 
+ 


bXiS 3 

> 


<  mo) 

-  f) 

>i) 


6-6° 

6“ 


bXxSz 

Xi  s3 

=  p(t4  >  *$■)  +  p(ts  >  ig£) 


=  P(t4-°i 

V  °3 

<  P 


> 


> 

*  03  — 


b~bQ 

2ba 

b 


b-ba\ 
2b°  ) 


b^_  _  £2 
26°  cr3 


g)+^(r. 

+  P 


a\  (M2+M3) 


> 


T. fc- 


Ml^ 

(^2+Ma) 
Ml  0-3 


=  P[T4£P4(e)]  +  P[r5£P>5(e)] 


6-6°  _  QjdM2+M3) 

26°  Mi  / 

>  b-ba CTi  (M2+M3)  ^ 

—  26°  Mi  ^3  y 


where  the  next  to  last  step  follows  since  for  a  <  1,  — *  00  as  b  — *  00,  and  the  last 

step  follows  from  condition  (3.30)  and  since  mo  =  ba.  Hence  (6.1)  follows. 


As  for  (6.2)  through  (6.7),  let  @  and  eo  satisfy  conditions  (3.30)  and  (3.31) 
respectively,  with 

n  <r  fn  s  min/ £3  1  gi(M2+M3)  a  .  Mmz+Ms)  1  „nH 

U  <  e0  <  min  |  ^  ^  )  and 


B  —  {T\  G  Di(eo),T2  G  D2(eo),Tz  G  Dz(eo),T4  G  P4(eo))P5  £  Ps(eo)}- 


Since 


Ai  =  {n*  >  P'yn*  >  ba,n *  >  ba} 

{b  —  2 ba  >  n*  >  ba,  b  —  2£>a  >  n*  >  ba,b  —  2ba  >  n*  >  ba} 


_  /  26Q  s'  rV  ^  b-ba  2ba  ^  T*  \  T*  s' 

1  6-26°  ^  1  ^  6Q  5  6— 26Q  <  ^  2  +  3  < 


6-6*  2ba 


ba  ’  6-26“ 


<  Ta  +  T5  < 


6-frQ  1 
6*  J’ 


and  for  a  <  1,  537^  — >  0  and  -*  00,  as  6  — ►  00,  it  follows  for  large  6,  that 
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R  C  f  Ml  (^2+03)  ^  rp  ^  Vlfa+Vz)  1  r  1 

B~  t^+w)  _e°  <^1  <  a^+w)  +  £0) 

n  /  a3  I  ^lO^+Ms)  ^  np  \  rp  ^  Gz  \  CTl(^2+M3)  ,  r  1 

n\a2+  m2  “£o  <42  +  i3  <  T1+~i^~  +  €°J 

p|  /  ff2  I  ^lO^+W)  ^  rp  <  rp  <72  ,  <7l(/l2+M3)  t  1 

n  l  a3  +  - e0  <  u  +  H  <  ~z  +  ^3  +  Co) 

CAi. 

Hence, 


o  <  b[(t,)'/a,]  -  BKTO'/fi] 

<(0P(Al-B), 

=  0{ ;,(a^+Q<-i) )>  ^  =  1)2, 3, 4,  (6.8) 

as  b  — >  oo,  where  the  last  step  follows  from  condition  (3.30)  and  since  mo  =  ba.  Note 
that  this  expression  is  also  true  for  E[{T2  4-  T2)lIAl]  —  E[(T2  +  T3)lIB ]  and  for 
E[(TA  +  T5)lIAl]-E[(T4+T5)lIB). 

Next  we  show  that  conditions  (3.30)  and  (3.31)  imply  that  for  each  z  =  1,2, 3, 4, 5 

[  T\(t)dP  =  R\  +  0()-),  ast-+  oo,  (6.9) 


for  l  =  —  3,  —  2,  —1,1,2, 3.  To  this  end,  we  only  show  (6.9)  for  the  case  l  —  3  as 
similar  arguments  yield  the  result  for  the  cases  l  =  —  3,  —  2,  —  1, 1, 2.  Note  that  for 
each  z  =  1,2, 3, 4, 5 

Rf{l  -  (3/4)fl?P?  -  fl?]}  <  Tf(t)  <  Rf{  1  +  (3/4 )Rf[Tf  -  R})}, 
if  T{(t)  €  Di(e).  Hence,  by  conditions  (3.30)  and  (3.31) 

I[Ti(t)eDi(i)]Ti(t)dP  =  Ri  +  °(l)>  as  t  ->  oo. 

Now,  from  (6.9)  and  (3.30)  we  have 


Emm  =  ^PUir.wcAwiin^w «  am 

*¥  1 


=  (5jfeS7  +  °(p))(1-0(^)4 

_  ^l(<72+C73)  ,  P)/J_\ 

<7l(^2+M3)  ' 


(6.10) 
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Similarly, 

E[(T2  +  T3)Ib]  =  (E[T2I[T2{t)eD2(e)}]  +  £'[^3Jr[r3(t)e£)3(e)]])n  P(Ti  e  Di(e ) 

#  2,3 

=  (S  +  aiSf1+o(F))(1-o(w)3 

_  £3  I  °1  fe+Ma)  |  O/ J_\ 

”  a2  ^  ^a2 

and 

«[p4  +  T5)Ib]  =  (B[T4/BWeD<(,j]  +  BK/R(06O>w|])n  P(T,  €  Di(«) 

i^4,5 

=  (g  +  +  °(i>)  (1 "  °(r)3 

_  £2  1  £1  (^2+^3)  1  /n/_l\ 

“  ff3  +  + 

Now,  (6.2)  follows  from  (6.8)  and  (6.10).  Assertions  (6.3)  and  (6.4)  follow  similarly. 

The  results  in  (6.5)  to  (6.7)  can  be  obtained  using  (6.8),  (6.9)  and  arguments  as  in  (6.10). 

□ 

Lemma  6.1.2:  Suppose  the  assumptions  of  Lemma  6.1.1  hold.  Let 

h\  =  m('n(a(l  +  (3/2), 2  —  a)  for  a  and  (3  in  Lemma  6.1.1.  Then  for  Ni,  i  —  1, 2, 3 

defined  in  (3.12)  we  have,  as  b  — >  00, 

E(Xu,,-lti?=^E^+0(L)  (6.11) 


E(%2 Ni  +  X3 N,  “  0*2  +  M3))2  —  j^)  (6.12) 

2  3 

and 

E(Xiw1  —  pi)p(X2N2  +  Xzn3  ~  (h2  +  hz)Y  —  (6.13) 

for  p,  q  =  1,2. 

Proof. 

To  prove  (6.1 1),  note  that  since  the  three  populations  are  independent  and  Ni(>  mo)  is 
an  integer-valued  r.v.  measurable  w.r.t.  (Xn,  X12,  ...,  X\mo, 

X2J,  X22) ...,  X2mo,X31,X32) ...,  X3mo),  by  direct  calculations  it  follows  that 
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mo  , 

E(^u-/*i)P 

E(XWl  -  n{)2  =  ^E(^)  +  E11  N2 - m0a\E{jjs) 

—  aiE(jqk)  +  Rb,  '  (6.14) 

where 

m0 

Rb  =  <tJE[(JV*  -  JVO/WJV*)]  +  £[E(*lj  -  /n))X2  -  maafw-lr).  (6.15) 

i=i 

We  will  show  that  for  /z-x  defined  in  the  Lemma 

E\Rb\  =  0(-^~)  asb oo.  (6.16) 


Since  N\  =  [N*],  0  <  (iV*  —  Ni)/(NiN*)  <  2 (N*)  2 .  Moreover,  it  suffices  to 
show  that  the  last  two  terms  on  the  right  hand  side  of  (6.15)  are  O(prf)  with  Ni  replaced 
by  N*.  Now,  for  N* in  (3.9),  n*  in  (3.8)  and  sets  Ai,...,A-?  defined  earlier 


E(l/N*)2  =  b~2E[(  1  +  Ttfh,)  +  El(l/N*)%],  (6,17) 

where,  since  N*  >  mo  =  ba, 

E[(l/N*)2IA<]  <  m^2P(A2  U  ...  U  At) 

=  as  6  ->  oo,  (6.18) 

by  (6.1).  Moreover,  by  (6.2)  and  (6.5) 

-  b~2E[{  1  +  Ti)2 1 Al]  =  0(b~2)  as  6  — >  oo.  (6.19) 

Hence,  from  (6.17)  to  (6.19)  and  since  a  >  ^4 

£(1/AT*)2  =  0(--^})  as  6  -4  00.  (6.20) 

From  this  and  since  mo  =  ba,  the  third  term  on  the  right  hand  side  of  (6.15) 

moalE{^)  =  0(— ^y)  as  b  -+  00.  (6.21) 


49 


Finally,  as  in  (6.17) 


m0  m0 

£E(*U  -  W)1  -Wf2  =  b-2El[j:(,Xij  ~  m)]2(l  +  Ti)2^,] 

j= i  i=i 

mo  0 

+  m{E(Jfu-Ml)]SArr2}^f].  (6.22) 

j=l 


Since  iV*  >m0  =  ba  and  -EE(-Xij  -  p i)]4  =  O(toq),  by  the  Cauchy-Schwarz 

j= i 

inequality  and  (6.1) 


m0  i 

-  «)]2«r2}/A.]  =  0(7-0^)  (6.23) 

3= 1 


as  b  — >  co.  A  similar  argument  using  (6.5)  yields 


777o  -| 

6-2B[[£(Xu  -  ^1)]2(1  +  xi)2/^,]  =  O(p^)  (6.24) 

i=i 


as  6  — >  oo.  The  assertion  (6.16)  now  follows  from  (6.17)  to  (6.24).  Hence  the  assertion 

(6.11). 

For  (6.12)  note  that 

E(X2N2  +  Xznz  -  (M 2  +  M3))2 
=  E{X.2N2  ~~  f12)2_+  E{Xznz  —  M3)2 
-  2(AT2iv2  -  ^2){Xznz  -  M3) 

=  o|£^+^£^  +  0(it)  (6.25) 

where  the  last  statement  follows  from  the  same  arguments  used  to  prove  assertion  (6.1 1). 
For  (6.13)  with  p  =  1,  q  =  1,  first  note  that 
E(XiNi  ~  Ml)(^2iV2  +  ^3JV3  —  (M2  +_M3)) 

=  E(Xin1  -  Mi )(X2n2  -  M2)  +  E(X i^!  -  p\){Xw3  -  M3)  (6.26) 


so  it  suffices  to  show 
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£(*!*,  -  n)(Xm  -  /n)  =  0(A) 


(6.27) 


As  in  (6.14),  we  have 


_  _  mo 

E(Xm  -  to)(X2N2  ~  to)  =  £[£(*ij  -  to )]E[tl(X2j  -  to)]/(Ni N2).  (6.28) 

j=i  j= l 

Once  again,  it  suffices  to  show  that  the  right  hand  side  of  (6.28)  is  O(^)  with  Ni  and  N2 
replaced  by  Nf  and  AT*,  respectively.  As  in  (6.22),  the  right  hand  side  of  (6.28)  is 

mo  mo 

=  6-2£[E(v„  -  m)]E[[E(^  -  w)](i  +  r,)(i  +  %  +  t3)iAi] 

j=  1  j=  1 

mo  mo 

+  £{E(*U  -  to)]E[\Z(Xv  -  to )}/(N?N*)}IAl)  (6.29) 

j=l  j= 1 

=  O(prs))  +  0(^72))  asb-*oo, 

using  arguments  similar  to  (6.23)  and  (6.24).  Hence  the  assertion  (6.27),  and  hence  the 
required  result  in  (6.13)  with  p  =  q  =  1.  □ 

Proof  of  Theorem  3.1. 

Write  : 

XiNi (X2n2  +  Xsn3)  ~  to(to  +  to)  -  {Km  -  to)(to  +  to)  (6.30) 

+  (X-2n2  +  X3n3j-  (to  +_to))to 

+  (Xi^  -  to)(X2N2  +  X2Ni  -  (to  +  to)) 


Then  by  Lemma  6.1.2,  we  have 


E(X\n1  (X2n2  +  Xzn3)  —  to(to  +  to))2  —  (6.31) 

(to  +  to?c\E-X  +  fajE-X  +  pfajE-X  +  O(^), 


as  b  — >  oo.  By  the  definition  of  A)* in  (3.9)  and  arguments  as  in  (6.17)  we  have  by 
Lemma  6.1.1  that 


^  N* 


iEKl+T^  +  EK-^)^] 

4(i+sfeai)+oti)+o( 


6d+“) 


)• 


Similarly, 


E± 


N* 


=  i(i  +  Zl 
b\l  +  a2 


+ 


£i {P2+£z)\ 


)  +  O(j^YWi)  + 


(6.32) 


(6.33) 


and 


Ml  02 
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Elt?  =  K1  +  S  +  +  °(M)  +  O(pfel)-  (6-34) 

Substituting  these  expressions  into  (6.30),  we  have  (3.32).  Hence  the  theorem.  □ 


6.2  Proofs  of  Chapter  4  Theorems 

Proof  of  Theorem  4.1. 

For  (4.2),  note  that 

e\w„\  =  eym . ^[%[|#.,|  y,,y2,...,rj, 


and  that 


Ex[\Wiq\YuY2,. 

Y  1  =  711 

-  r(n,)/3".J 

roc  i 

1  rr — ~\  dx  =  +  oo 

o  2  Jcc  y  | 

Therefore, 

E\Wlq 

I  =  +  oo. 

(4.3)  now  follows  since 


=  +  oo  =>  Ew\q  —  +  oo.  □ 

Proof  of  Theorem  4.2. 

For  (4.6),  let  Di(eo)  =  [f3  —  eo,  (3  +  eo],  ^2(^0 )  =_[r  —  e0i r  +  eo], 

B  =  [X\ni  G  jDi(eo),  X2„2  G  ^(eo)],  and  zl  =  ^4^  <  p0  •  Note  that  there  exists  an 
eo  £  (0,  such  that  B  C  A.  Observe  that  for  any  u  >  1,  and  any  e  G  (0, 

P[x,„,^0,(£)]  =0(i),andP[X2w^D2({)]  =0(4).  (635) 

'll  n2 

We  have: 

-  Wq)2  =  E(Wq  -  WqfIA  +  E(Wq  -  Wq)2IAc  (6.36) 

First,  note  the  second  term  in  (6.36) 

E(w,  -  W,fJA,  =  0(  J ff)  +  0( -ir,) 

n\  n2 

by  the  Cauchy-Schwarz  inequality. 

Now  note  that  the  first  term  in  (6.36)  is 
E(W,  -  W,flA  -  E(W,  -  W,)-IB  +  E(W,  -  W,fIB 
=  E(W,  -  W,fIA.B  +  E(W,  -  W,)% 

<  ( E(W ,  -  W,)*)*(E(IA-b))* ‘  +  E(W,  -  W,)Hb 
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=  (E(Wq  -  Wq)A)*(P(A  -  B))v'  +  E(Wq  -  Wq)2IB 
=  0(l)(P[Xln,  i  Di{e)  U  X2„2  i  D2(e)]f  +  E(Wq  -  Wq)2IB 
<  0(-p)  +  O(^)  +  £(Wg  -  WqflB, 

for  any  u>  >  1. 


(6.37) 


Thus  by  expanding  for  the  function  Wq  in  a  Taylor  series  on  B,  and  (6.37),  we  have 
(4.6).  ' 

For  (4.5),  note  that  similar  to  (6.37),  we  have,  for  any  uj  >  1, 


E(W„  -  Wt)IA  =  E(W,  -  W,)IB  +  O(jJi)  +  0(^3) 

=  0(i)  +  0(i). 

(4.5)  follows  from  (6.35)  and  (6.38).  □ 


(6.38) 


Our  proof  of  Theorem  4.3  depends  on  the  following  three  lemmas.  Let  T(n),  e,  and  D{e) 
be  defined  as  in  (4.16)  to  (4.18). 

Lemma  6.2.1:  If  X\\ ,  X12, ...  X\n  are  i.i.d.  exponential  random  variables  with  mean  (3 
and  X21 ,  X22,  ■■■  X2 n  are  i.i.d.  random  variables  with  mean  r  and  finite  variance  o2 
such  that  (4.19)  holds,  then 

For  any  u>  >  1  and  any  fixed  e  €  (0,  e), 


n“P[T(n)  i  D(i)\  =  0(1) 

(6.39) 

n"P[S2(n)  i  03(£)]  =  0(1) 

(6.40) 

For  Ni  defined  as  in  (4.15)  with  mo  =  ba,  we  have: 

$ 


E^-  = 


+  0^ 


N\  b(3(r2  +  cr2)(A  —  /3(r2  +  o2))  &(1+a) 


(6.41) 


1  _  $ 

N2  6A(A-/5(r2  +  cr2)) 


+  0 


(6.42) 


where  A  is  given  in  (4.9)  and  $  is  given  in  (4.10). 

Proof.  (6.39)  and  (6.40)  are  proved  in  Zheng,  Seila  and  Sriram  (1995).  As  for  (6.41)  it 
suffices  to  show  this  result  with  Ni  replaced  by  N* .  Then  using  arguments  as  in  (6.17) 
we  have: 
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^  N*  ^i/5(r2+<72)(A-3(T2+<72)) 


[mo<  n*  <  6-m0] 


+  0 


1 

b<*(  1+0) 


_  _ £ _ 

&/?(t2+0'2)(A-/3(t2+C72)) 


+  °(w)’ 


(6.43) 


by  (4.20)  where  hi  —  min(o:(l  +  (3 ),  1  +  a).  Hence  we  have  (6.41)  noting  that  u  is 
arbitrary.  (6.42)  is  proved  similarly.  □ 

Lemma  6.2.2:  IfNi  and  N2  are  defined  as  in  (4.15),  a  €  (0.5, 1  -  |^|),  (n°pt,  n^)  are 
defined  as  in  (4.8)  and  do  =  min(  1  +  a,  2  —  a),  t/ien: 


£(*,,,, -ffl2  =  /?-L+ 0(2.)  (6.44) 

£(^-r)2  =  a2-2,+0(i)  (6.45) 

n2 

E(s2(N2)  -  a2)2  =  (k4  -  „4)_L  +  0(2-)  (6.46) 

«&,-/!)'(Il»,-r)'(j,(J»1)-»!)r=0(2)  (6,47) 

for  p,q,r  —  0,  1,  2,  such  that  (p,  q,  r )  ^  (2, 0, 0),  (0, 2, 0),  (0, 0, 2),  and  (0, 1, 1). 

E(X2fl,  -  t)(s2(N2)  -  a2)  =  k3-2,  +  0(2-)  (6.48) 

Proof.  For  (6.44),  using  similar  arguments  to  prove  Lemma  6.1.2  we  have 

E(Xm  -  P)2  =  H2E(^)  +  0(2-)  (6.49) 

where  di  =  min(o:(l  +  o>/2),  2  —  a).  Now  using  (6.43)  we  have 

£(^)  =  f;(^)+0(w^))  <6-50) 

Hence  (6.44).  (6.45),  (6.46)  and  (6.48)  are  shown  by  similar  arguments.  (6.47)  is  shown 
using  similar  arguments  used  to  prove  (6.13).  □ 


Lemma  6.2.3:  For  any  u  >  1  and  0  <  t  <  1,  the  following  hold 


(6.53) 


P[XOT!>p„X1W,]=0(^;) 


P[s2(iV2)  ?!  D3(0]  =  0(jL)  (6.54) 

Proof.  (6.51)  to  (6.53)  are  proved  in  Lemma  3  of  Zheng,  Seila,  and  Sriram  (1997b). 
(6.54)  is  proved  in  a  similar  manner.  □ 

Proof  of  Theorem  4.3. 

Let  A  =  [X2n2  <  Po-^iJVj ]•  By  the  definition  of  Wq,  we  have 


E(Wq  -  Wq)2  =  E 


2(*iw1-*2W2) 


2 


o~2+r2 
2(f)— r) 


Ia 


+  E 


PoXzN- 


2(1— Po) 


2 


*A«. 


(6.55) 


We  first  consider  the  second  term  in  (6.55). 


_  2(1— po) 

Po  ® 


2 

/a- 


n2 


-  (t  +  T)  +c 
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s  2(5fe)2{£(^  +  -  (T+rf1*  + 

=  °(bir1),  for  any  u\  >  1,  by  (6.53). 

Let  B  =  [-X’lJVj  €  jDi(e0),  X2 n2  £  D2(e o)] .  For  the  first  term,  by  Lemma  3,  using 
arguments  similar  to  (6.37)  we  have 


E 


Wq-Wq 


7 A  =  E[Wq-Wq]  IB  +  0{-h). 


Using  a  Taylor  expansion  for  the  function  Wq  on  B,  and  noting  that  u  is  arbitrary,  it 
follows  from  (4.8),  (4.1 1),  and  Lemma  6.2.2  that 


E 


s2+x ; 


2  JV, 


-Xii v2) 


2 


g2+r2 

2(0-t) 


I  A  = 


(g2+T)2 

4(/3-r)4 


£(*,„, -/?)2 


.  (2/3r-r2+cr2)2  _\2 

+  4(/3_t)4  M^2JV2  -  T) 

+  4(F7F^2(^)-a2)2 

+  %4FiS(^Wl-r)(S2(JV2)-ff2) 

+  °(i¥) 

=  v°(6)  +  <?(&).  □ 
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APPENDIX 


TURBO  PASCAL  CODE  FOR  SIMULATIONS 


program  mul2 3 (indata, outdata, seeddata) ; 

{  $N+ } 

(******************************************************** 


**  Written  by  Kevin  Burns  3  0  October  1997  ** 
**  This  program  implements  the  two-stage  sampling  ** 
**  plan  for  the  function  mul  (irru2  +  mu3 )  and  Bernoulli  ** 
**  populations.  The  goal  is  to  show  that  the  MSE  of  ** 
**  sampling  plan  converges  to  the  minimum  MSE  as  the  ** 
**  sampling  budget  tends  to  infinity.  ** 


uses  rvgen,crt; 

var  X, Y, Z , i , j , nO ,  reps  , mmore , nmore,  lmore  :  integer; 

xbar,ybar,  zbar,  sigmalhat ,  sigma2hat ,  sigma3hat/  f  hat :  real; 

sigmal  #  sigma2  ,  sigma3  #  ms  tar ,  nstar ,  Istar  #  01,02,  c3  ,  Vest :  real  ; 

bigmstar ,  bignstar ,  biglstar ,mopt ,  nopt ,  lopt ,  V,  ratio  :  real  ; 

mul  ,mu2  ,mu3  ,  f ,  f sqr sum,  Vs tdev, upper ,  lower,  fdif  f ,  bvhat :  real  ; 

alpha, f sum:  real; 

indata , outdata , seeddata :  text ; 

outfile:  string[20] ; 

b,  sumx,  sumy,  sumz  ,M,N,  L:  longint; 

procedure  startup; 

(***  This  procedure  reads  in  the  parameter  values,  gets  the  ***) 
(***  initial  random  seeds,  calculates  the  optimal  ***) 

(***  allocation-and  the  minimum  variance,  and  calculates  the***) 
(***  initial  sample  size.  ***) 

begin  (*  startup  *) 

assign  (indata,  ‘ musetup . dat ' )  ; 
assign  (seeddata,  ' seed.dat ' )  ; 
reset  (indata) ; 

(*  read  in  parameter  values  *) 

readln  ( indata , mul ,  mu2  , mu3  ,  b,  reps ,  cl ,  c2  ,  c3  )  ; 
read  (indata, outfile) ; 
assign  (outdata, outfile) ; 
rewrite  (outdata) ; 

(*  calculate  initial  sample  size  *) 
alpha := (0.5+1- In (3) /ln(b) ) /2; 
nO : =  trunc (exp (alpha* In (b) ) ) ; 

writeln  (outdata,  'mul  is  *  ,mul:6:4,’  mu2  is  mu3  is 

1 , mu3 : 6 : 4 ) ; 

writeln (outdata, 'nO  is  1 , nO : 4 , '  b  is  ',b:4,'  alpha  is 
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' , alpha: 6 : 4, '  reps  is  ' ,reps:4); 

writeln (outdata, ' cl  is  ' ,01:6:4, '  c2  is  ',02:6:4,'  c3  is 
' , c3 : 6 : 4 ) ; 
f sum: =  0; 
f sqrsum: =  0; 
f : =  mul*(mu2  +  mu3 )  ; 
sigmal  :=  sqrt (mul* ( 1-mul ) ) ; 
sigma2  :=  sqrt (mu2 * ( l-mu2 ) ) ; 
sigma3  :  =  sqrt (mu3 * ( l-mu3 ) ) ; 

(*  calculate  optimal  allocation  *) 
mopt :  =  (b*sigmal*  (mu2+mu3  )  )  /  (cl*sigmal*  (mu2+mu3  )  +c2*mul*sigma2+ 
c3*mul*  sigma3); 

nopt :  =  (b*mul*sigma2 )  /  (cl*sigmal*  (mu2+mu3  )  +c2  *mul*sigma2+c3  *mul 
*sigma3 ) ; 

lopt :  =  (b*mul*sigma3  )  /  (cl* sigmal*  (mu2+mu3 )  +c2 *mul*sigma2+c3  *mul 
*sigma3 ) ; 

writeln (outdata,  'mopt  is  ', mopt: 8: 6,'  nopt  is  ', nopt: 8: 6,' 
lopt  is  1 , lopt : 8 : 6) ; 

(*  calculate  minimum  variance  *) 

V : =  (sqr (mu2+mu3 ) *sqr (sigmal ) )  /mopt  + 

( sqr  (mul)  *sqr  (sigma2)  )  /nopt  +  (sqr  (mul)  *sqr  (sigma3  )  )  /  lopt; 
reset (seeddata) ; 
getseeds ( seeddata , 3 ) ; 
close  (indata); 

end;  (*  startup  *) 

procedure  ini t samp; 

(***  This  procedure  takes  the  initial  samples  from  the  ***) 

(***  populations.  Then,  it  estimates  the  means  and  standard***) 
(***  deviations.  ***) 

begin  (*  initsamp  *) 

shim  :  =0 ; 
siuny :  =0  ; 
sumz : =0 ; 

for  i : =  1  to_nO  do 

begin  (*  for  loop  to  generate  random  variables  *) 

X  :=  rvbernoulli (mul , 1 ) ; 

sumx : -  sumx  +  X; 

Y  :=  rvbernoulli (mu2 , 2 ) ; 

sumy : =  sumy  +  Y; 

Z  : =  rvbernoulli (mu3 , 3 ) ; 
sumz : =  sumz  +  Z ; 

end;  (*  for  loop  to  generate  random  variables  *) 

(*  estimate  mus  *) 

if  sumx  =  0  then 
xbar : =  1/nO 
else 

if  sumx  =  nO  then 
xbar:=  (nO  -  l)/nO 
else 

xbar:=  sumx/nO; 
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if  sumy  =  0  then 
ybar : =  1/nO 
else 

if  sumy  =  nO  then 
ybar : =  (nO  -  l)/nO 
else 

ybar:=  sumy/nO; 

if  sumz  =  0  then 
zbar:=  1/nO 
else 

if  sumz  =  nO  then 
zbar : =  (nO  -  1 ) /nO 
else 

zbar:=  sumz/nO; 

(*  estimate  sigmas  *) 

sigmalhat  :  =  sqrt (xbar* ( 1-xbar ) ) ; 
sigma2hat  :=  sqrt (ybar* ( 1 -ybar ) ) ; 
sigma3hat  :  =  sqrt ( zbar *( 1-zbar) ) ; 

end;  (*  ini t samp  *) 

procedure  size ; 

(***  This  procedure  calculates  the  final  sample  sizes  ***) 
begin  (*  size  *) 

if  ( sigmalhat+sigma2hat+sigma3hat )  <  0.00001  then 
begin 

mstar  :=b/2; 
nstar  :=b/4; 
lstar  : =b / 4 ; 
end 
else 
begin 

mstar :  =  (b*sigmalhat *  (ybar+zbar)  )  /  (cl*sigmalhat*  (ybar+zbar )  +c2 
*xbar*sigma2hat+c3*xbar*sigma3hat ) ; 
nstar:  =  (b*xbar*sigma2hat )  /  (cl*sigmalhat*  (ybar+zbar)  +c2*xbar 
*sigma2hat+c3 *xbar*  sigma3hat) ; 
lstar:  =  (b*xbar*sigma3hat )  /  (cl*sigmalhat*  (ybar+zbar)  +c2*xbar 
*sigma2hat+c3*xbar*  sigma3hat) ; 

end; 

if  mstar  <=  nO  then  mmore:=0 
else  mmore : =1 ; 

if  nstar  <=  nO  then  nmore:=0 
else  nmore:=l; 

if  lstar  <=  nO  then  lmore:=0 
else  lmore : =1 ; 

case  mmore  +  2*nmore  +  4*lmore  of 

1:  begin  (*  case  1:  mstar  gets  the  rest  *) 
bigmstar:=  (b- (c2+c3) *n0) /cl; 
bignstar:=  nO ; 
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end; 

2:  begin  (*  case  2:  nstar  gets  the  rest  *) 
bignstar : =  (b- (cl+c3 ) *nO ) /c2 ; 
bigmstar :=  nO; 
end; 

3:  begin  (*  case  3:  mstar  and  nstar  split  the  rest  *) 

bigmstar :=  nO  +  (mstar* (b- (cl+c2+c3 ) *nO ))/ (cl*mstar  + 
c2*nstar) ; 

bignstar:=  nO  +  (nstar*  (b-  (cl+c2+c3  )  *nO ))/  (cl*mstar  + 
c2*nstar) ; 

end; 

4:  begin  (*  case  4:  lstar  gets  the  rest  *) 
bigmstar:=  nO; 
bignstar :=  nO ; 
end; 

5:  begin  (*  case  5:  mstar  and  lstar  split  the  rest  *) 

bigmstar : =  nO  +  (mstar* (b- (cl+c2+c3 ) *nO ))/ (cl*mstar  + 
c3*lstar) ; 
bignstar :=  nO ; 
end; 

6:  begin  (*  case  6:  nstar  and  lstar  split  the  rest  *) 

bignstar :=  nO  +  (nstar* (b- (cl+c2+c3 ) *nO) )/ (c2*nstar  + 
c3*lstar) ; 
bigmstar: =  nO ; 
end; 

7:  begin  (*  case  7:  they  all  get  more  *) 
bigmstar :=  mstar; 
bignstar:=  nstar; 
end; 

end;  (*  end  case  statement  *) 

biglstar : =  (b  -  cl*bigmstar  -  c2 *bignstar ) /c3 ; 

M:=  trunc (bigmstar )  ; 

N :  =  trunc  (bignstar)'; 

L:=  trunc ( (b  -  cl*M  -  c2*N)/c3); 

end;  (*  size  *) 

procedure  f inalsamp; 

(***  This  procedure  samples  from  the  populations  again  ***) 
var  restx, resty, restz : integer; 

begin  (*  f inalsamp  *) 

restx :=  M  -  nO ; 

resty:=  N  -  nO ; 

restz :=  L  -  nO ; 

if  restx  >  0  then 

begin  (*  if  restx  >  0  *) 
for  i:=  1  to  restx  do 

begin  (*  for  loop  to  generate  X's  *) 

X  :=  rvbernoulli (mul , 1) ; 
if  X  =  1  then 
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sumx : =  sumx  +  1  ; 

end;  (*  for  loop  to  generate  X's  *) 
xbar:=  sumx/M; 
end;  (*  if  restx  >  0  *) 

if  resty  >  0  then 
begin  (*  if  resty  >  0  *) 
for  i:=  1  to  resty  do 

begin  (*  for  loop  to  generate  Y's  *) 

Y  : =  rvbernoulli (mu2 ,  2 )  ; 
if  Y  =  1  then 

sumy :  =  suray  +  1  ; 

end;  (*  for  loop  to  generate  Y's  *) 
ybar:=  sumy/N; 
end;  (*  if  resty  >  0  *) 

if  restz  >  0  then 
begin  (*  if  restz  >  0  *) 
for  i : =  1  to  restz  do 

begin  (*  for  loop  to  generate  Z's  *) 

Z  :=  rvbernoulli (mu3 , 3 ) ; 
if  Z  =  1  then 

sumz  : =  sumz  +  1 ; 

end;  (*  for  loop  to  generate  Z's  *) 
zbar:=  sumz/L; 
end;  (*  if  restz  >  0  *) 

end;  (*  finalsamp  *) 

procedure  calculate ; 

(***  This  procedure  estimates  the  mean  square  error  with  a  ***) 
(***  confidence  interval  and  then  compares  the  estimated  ***) 
(***  mean  sqare  error  to  the  minimum  variance.  ***) 

begin  (*  calculate  *) 

Vest:=  f sum/reps; 
bvhat:=  b*Vest; 

writeln (outdata,  'The  actual  minimum  MSE,  V  =  ',VilO:8); 
writeln (outdata, 'The  estimated  mean  squared  error,  Vest  = 

' , Vest : 10 : 8) ; 

writeln (outdata, 'b  times  the  estimated  mean  squared  error 
is  ' , bvhat : 10 : 8 ) ; 
ratio :=  Vest/V; 

writeln (outdata, ' ratio  is  ' , ratio : 8 : 6 ) ; 

Vstdev : =  sqrt ( ( f sqrsum  -  (sqr ( f sum) /reps ))/ (reps  -  1)); 

lower :=  (Vest  -  (1 . 96*Vstdev/ (sqrt (reps) ))) /V; 

upper : =  (Vest  +  (1 . 96*Vstdev/ (sqrt (reps) ) ) ) /V; 

writeln (outdata, ' 95%  lower  and  upper  confidence  limits 

for  ratio  are  [ ' , lower : 10 : 8 ,  '  1 , upper : 10 : 8 ,  '  ]  1  )  ; 

rewrite  (seeddata) ; 

for  i:=  1  to  3  do 

writeln (seeddata, seed(i)  :  12  )  ; 
end;  (*  calculate  *) 


■k  ic  -k  -k 


begin  (****  main  program 
startup; 

for  j : =  1  to  reps  do 

begin  (*  main  for  loop  *) 
initsamp; 
size; 

finalsamp; 

fhat:=  xbar* (ybar  +  zbar) ; 
fdiff:=  sqr(fhat  -  f )  ; 
f  sum:  =  fsum  +  fdiff; 
fsqrsum:=  fsqrsum  +  sqr ( fdiff) 
end;  (*  main  for  loop  *) 
calculate; 
close (outdata)  ; 
close  (seeddata)  ; 
end.  (*  main  program  *) 


64 


program  mgl (indata, outdata, seeddata) ; 

{$N+} 

(************************************************************** 
**  Written  by  Kevin  Burns  28  February  1998  ** 

**  This  program  implements  the  two-stage  sampling  plan  for  ** 

**  the  mean  waiting  time  in  the  M/G/l  queue  when  the  ** 

**  service  times  are  exponential.  The  goal  is  to  show  ** 

**  that  the  MSE  of  the  sampling  plan  converges  to  the  ** 

**  minimum  MSE  as  the  sampling  budget  tends  to  infinity.  ** 

**************************************************************) 
uses  rvgen,crt; 

var  i,j,nO,reps:  integer; 

xbar ,  ybar ,  ysqrsum,  ycubsum,  y4sum,  sigsqrhat, subl, sub2  ,  sub3  , 
ms  tar ,  nstar  ,01,02,  Vest ,  alpha ,  delta ,  phi ,  kappa3  ,  kappa4 ,  rho , 
bigms tar ,  bigns tar ,  mopt ,  nopt ,  V,  ratio ,  f  sum,  del tahat ,  phihat , 
beta,  tau,  sigmasqr ,  f ,  fsqrsum,  Vs tdev, upper,  lower,  fdif f ,  fhat , 
X,  Y,  sumx,  sumy,  sublhat ,  sub2hat ,  kappa3hat ,  kappa4hat :  real  ; 
indata, outdata, seeddata: text; 
outfile:string[20] ; 
b,M,N:  longint; 

procedure  startup; 

(***  This  procedure  reads  in  the  parameter  values,  gets  the  ***) 
(***  initial  random  seeds,  calculates  the  optimal  allocation***) 
(***  and  the  minimum  variance,  and  calculates  initial  sample***) 
(***  size.  ***) 

begin  (*  startup  *) 

assign  (indata, 'setup.dat') ; 
assign  (seeddata, 1 seed2 . dat ' ) ; 
reset  (indata) ; 

(*  read  in  parameter  values  *) 

readln  (indata, beta,  tau,  sigmasqr ,  kappa3  ,  kappa4 ,  rho, b,  reps , 
cl , c2 ) ; 

read  (indata, outfile) ; 
assign  (outdata, outfile) ; 
rewrite  (outdata) ; 

(*  calculate  initial  sample  size  *) 
alpha : = ( 0 . 5+1 -In (2 ) /In (b) ) /2; 
nO : =  trunc (exp (alpha* In (b) ) ) ; 

writeln (outdata,  ’beta  is  '/beta^^,1  tau  is  1  ,tau:6:4,' 
sigma  squared  is  ’,sigmasqr:6:4); 

writeln (outdata, ’ kappa3  is  1 , kappa3 : 6 : 4 , '  kappa4  is 
1 , kappa4 : 6 : 4 , '  rho  is  ' , rho : 6 : 4 ) ; 

writeln  (outdata,  '  alpha  is  ’  ,  alpha :  6  :  4 ,  '  nO  is  1  ,  nO  :  4 ,  '  b  is 
',b:4,'  reps  is  ',reps:4); 

writeln (outdata, ' cl  is  ',01:6:4,’  c2  is  ',c2:6:4); 
f sum: =0 ; 
fsqrsum: =0 ; 

f : =  (sigmasqr+sqr (tau) ) / (2* (beta-tau) ) ; 
subl : =2*beta* (kappa3* (3*sqr (tau) 

-sigmasqr) +tau* (kappa4+sigmasqr* (2*sqr (tau) - 
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3  *sigmasqr) ) ) ; 

sub2  :  =2*kappa3*tau*  (sqr  (tau)  -sigmasqr )  +kappa4*sqr  (tan)  ; 
sub3 : =2*beta*tau-sqr ( tau) +sigmasqr; 

delta  :  =sqrt  (sqr  (beta)  *  (4*kappa3*tau+kappa4+4*sqr  (tau)  *sigmasqr 
-sqr (sigmasqr) ) -subl+sub2+sqr (sqr ( tau) ) *sigmasqr 
-3  *sqr (tau) *sqr (sigmasqr) +sigmasqr*sqr (sigmasqr) ) ; 
phi : =sqr (beta) * (4*kappa3*tau+kappa4 

-sqr (sqr (tau) ) +2 *sigmasqr* (sqr (tau) -sigmasqr) ) 
-subl+sub2+sigmasqr* (sqr (sqr (tau) ) 

-3* sqr (tau) *sigmasqr+sqr (sigmasqr) ) ; 

(*  calculate  optimal  allocation  *) 
mopt  : = (b*beta* (sqr (tau) +sigmasqr ) * (delta- 
beta*  (sqr (tau) +sigmasqr) ) ) /phi ; 
nopt  :=  b  -  mopt; 

writeln (outdata,  'mopt  is  '  ,mopt :8:6,'  nopt  is  ' ,nopt:8:6); 

(*  calculate  minimum  variance  *) 

V: =  (sqr ( sigmasqr+sqr ( tau) ) *sqr (beta) ) / (4*sqr (sqr (beta 
-tau) ) *mopt) + (sqr (sub3 ) *sigmasqr) / (4*sqr (sqr (beta 
-tau) ) *nopt ) + (kappa4-sqr (sigmasqr) ) / (4*sqr (beta 
-tau) *nopt) + (sub3*kappa3) / (nopt *2* (beta- tau) *sqr (beta 
-tau) ) ; 

reset (seeddata) ; 
get seeds (seeddata, 2 )  ; 
close  (indata) ; 

end;  (*  startup  *) 

procedure  ini tsamp ; 

(***  This  procedure  takes  the  initial  samples  from  the  ***) 

(***  populations.  Then,  it  estimates  the  means  and  higher  ***) 
(***  moments.  ***) 

begin  (*  initsamp  *) 

sumx : =0 ; 
sumy : =0 ; 
ysqrsum: =0 ; 
ycubsxim:  =0 ; 
y4sum:=0; 

for  i:=  1  to  nO  do 

begin  (*  for  loop  to  generate  random  variables  *) 

X  :=  rvexpon (beta, 1 ) ; 

sumx:=  sumx  +  X; 

Y  : =  rvexpon ( tau , 2 ) ; 
sumy : =  sumy  +  Y; 
ysqrsum:  =ysqr  sum  +  sqr(Y); 
ycubsum:  =ycubsum  +  Y*sqr(Y); 
y4sum:=y4sum  +  sqr(sqr(Y)); 
end;  (*  for  loop  to  generate  random  variables  *) 

(*  estimate  means  *) 
xbar : =  sumx/nO; 
ybar:=  sumy/nO; 

(*  estimate  higher  moments  *) 

sigsqrhat  : =  (ysqrsum- (sqr (sumy) /nO ) ) / (nO-1 ) ; 


kappa3hat  :=  ycubsum/nO  -  3 *ysqrsum*ybar/nO  +  2 *ybar*sqr (ybar ) 
kappa4hat  :=  y4sum/n0  -  4*ycubsum*ybar/n0  + 

6*ysqrsum*sqr (ybar) /nO  -  3*sqr  (sqr  (ybar )  )  ; 

end;  (*  initsamp  *) 

procedure  size; 

(***  This  procedure  calculates  the  final  sample  sizes  ***) 
begin  (*  size  *) 

sublhat : =  2*xbar* (kappa3hat* {3*sqr (ybar) 

-sigsqrhat) +ybar* (kappa4hat+sigsqrhat* (2*sqr (ybar) 
-3*sigsqrhat) ) ) ; 

sub2hat : =  2*kappa3hat*ybar* (sqr (ybar) 

-sigsqrhat) +kappa4hat* sqr (ybar) ; 
deltahat:=  sqrt (sqr (xbar) * (4*kappa3hat*ybar+kappa4hat+4* 
sqr (ybar) *sigsqrhat-sqr (sigsqrhat) ) 
-sublhat+sub2hat+sqr (sqr (ybar) ) *sigsqrhat 
-3*sqr (ybar) * sqr (sigsqrhat ) + 
sigsqrhat *sqr (sigsqrhat ) ) ; 
phihat : =  sqr (xbar) * (4*kappa3hat*ybar+kappa4hat 

-sqr (sqr (ybar) ) +2 *sigsqrhat* (sqr (ybar) -sigsqrhat) ) 
-sublhat+sub2hat+sigsqrhat* (sqr (sqr (ybar ) ) 

-3*sqr (ybar) *sigsqrhat+sqr (sigsqrhat) ) ; 

mstar : =  (b*xbar* (sqr (ybar) +sigsqrhat ) * (deltahat 
-xbar* (sqr (ybar) +sigsqrhat) ) ) /phihat; 

if  mstar  <=  nO 

then  bigmstar:=nO 
else 

begin  (*  else  *) 

if  mstar  >=  ( (b-c2*n0 ) /cl) 

then  bigmstar:= ( (b-c2*n0) /cl) 
else  bigmstar : =mstar ; 
end;  (*  else  *) 

bignstar:=  (b  -  cl*bigmstar) /c2 ; 

M:=  trunc (bigmstar ) ; 

N:=  trunc ( (b  -  (cl*M))/c2); 

end;  (*  size  *) 

procedure  f inalsamp; 

(***  This  procedure  samples  from  the  populations  again  ***) 
var  restx, resty: integer ; 

begin  (*  f inalsamp  *) 
restx :=  M  -  nO ; 
resty:=  N  -  nO ; 

if  restx  >  0  then 

begin  (*  if  restx  >  0  *) 
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for  i:=  1  to  restx  do 

begin  (*  for  loop  to  generate  X's  *) 

X  :=  rvexpon  (beta,  1)  ; 
sumx:  =  sumx  +  X; 

end;  (*  for  loop  to  generate  X's  *) 
xbar:=  sumx/M; 
end;  (*  if  restx  >  0  *) 

if  resty  >  0  then 

begin  (*  if  resty  >  0  *) 

for  i:=  1  to  resty  do 

begin  (*  for  loop  to  generate  Y's  *) 

Y  : =  rvexpon  ( tau ,2)  ; 
sximy  :=  sumy  +  Y; 
ysqrsum:  =ysqrsum  +  sqr(Y); 
ycubsum:  =ycubsum  +  Y*sqr(Y); 
y4sum:=y4sum  +  sqr(sqr(Y)); 
end;  (*  for  loop  to  generate  Y's  *) 
ybar:=  sumy/N; 

sigsqrhat  :  =  (ysqrsum-  (sqr  (sumy)  /N)  )  /  (N-l )  ; 
end;  (*  if  resty  >  0  *) 
end;  (*  finalsamp  *) 
procedure  calculate ; 

(***  This  procedure  estimates  the  mean  square  error  with  a  ***) 
(***  confidence  interval  and  then  compares  the  estimated  ***) 
(***  mean  sqare  error  to  the  minimum  variance.  ***) 

begin  (*  calculate  *) 

Vest:=  fsum/reps; 

writeln (outdata,  'The  actual  minimum  MSE,  V  =  *  ,V:10:8); 
writeln  (outdata,  'The  estimated  mean  squared  error,  Vest  = 

'  ,Vest : 10 : 8)  ; 
ratio :=  (Vest-V) /V; 

writeln (outdata,  *  ratio  is  ' , ratio : 8 : 6 ) ; 

Vstdev:=  sqrt  (  ( f  sqrsum  -  ( sqr  ( f  sum) /reps ))/ (reps  -  1)  )  ; 
lower  :=  ((Vest-V)  -  (1.96*Vstdev/ (sqrt  (reps)  )  )  ) /V; 
upper  :=  ((Vest-V)  +  (1.96*Vstdev/ (sqrt  (reps)  ))) /V; 
writeln (outdata, ' 95%  lower  and  upper  confidence  limits  for 
ratio  are  [ ' , lower : 10 : 8,  '  ' , upper : 10 : 8 ,  ' ]  '  )  ; 

rewrite  (seeddata) ; 
for  i:=  1  to  2  do 

writeln (seeddata, seed(i)  :  12 )  ; 
end;  (*  calculate  *) 

begin  (****  main  program  ****) 
startup; 

for  j:=  1  to  reps  do 

begin  (*  main  for  loop  *) 
initsamp; 
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size; 

final samp; 

if  ybar  <=  rho*xbar  then 

fhat : =  (sigsqrhat+sqr (ybar) ) / (2* (xbar-ybar ) ) 
else 
begin 

fhat : =  (rho*ybar* (sigsqrhat/sqr (ybar) +1) ) / (2* (1 
-rho) ) ; 

end; 

fdif f : =  sqr (fhat  -  f ) ; 
f sum: =  f sum  +  fdiff; 
fsqrsum:=  fsqrsum  +  sqr (fdiff); 
end;  (*  main  for  loop  *) 
calculate; 
close (outdata) ; 
close  (seeddata) ; 
end.  (*  main  program  *) 


KEVIN  EDWARD  BURNS 

Optimal  Two  Stage  Procedures  for  Estimating  Functions  of  Parameters  in  Reliability  and 
Queueing  Models 

(Under  the  direction  of  ANDREW  F.  SEILA) 

In  this  dissertation,  we  consider  the  problem  of  estimating  functions  of  parameters 
found  in  reliability  and  queueing  models.  The  problem  is  to  allocate  a  fixed  sampling 
budget  among  the  populations  with  the  goal  of  minimizing  the  mean  squared  error  (MSE) 
of  the  estimator.  We  consider  the  reliability  model  with  three  components  such  that  the 
probability  the  system  works  is  /(/x2 ,  //2,  /u)  =  Pi(/U  +  ^3),  and  the  mean  waiting  time 
of  the  M/G/l  queue.  For  each  of  these  models,  we  consider  a  set  of  sample  sizes  referred 
to  as  a  first-allocation  procedure  which  minimizes  the  first-order  approximation  to  the 
MSE.  Since  the  first-order  allocation  procedure  depends  on  the  unknown  parameters  in 
the  model,  we  propose  a  two-stage  procedure  in  which  we  first  use  a  fraction  of  the 
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sizes  from  the  first-allocation  procedure  goes  to  zero  as  the  budget  goes  to  infinity. 
Simulations  are  used  to  demonstrate  the  asymptotic  optimality  results  for  the  two  stage 
procedures.  The  empirical  studies  show  that  the  two  stage  estimation  procedures  work 
well  for  reasonable  sample  sizes. 

INDEX  WORDS:  Allocation,  Asymptotic,  Distribution,  Estimator,  Exponential, 

Infinite,  Mean  squared  error,  Model,  Optimal,  Parameters,  Queue, 
Sample  Size,  Simulation 


